discourse/script/bulk_import
Leonardo Mosquera bfecbde837
Fixes for vBulletin bulk importer (#17618)
* Allow taking table prefix from env var

* FIX: remove unused column references

The columns `filedata` and `extension` are not present in a v4.2.4
database, and they aren't used in the method anyway.

* FIX: report progress for tables without imported_id

* FIX: actually check for AR validation errors

NOTE: other migration scripts also have this problem; see /t/58202

* FIX: properly count Posts when importing attachments

* FIX: improve logging

* Remove leftover comment

* FIX: show progress when exporting Permalink file

* PERF: stream Permalink file

The current approach uses a lot of memory; write one line at a time instead

* Document fixes needed

* WIP - deduplicate category names

* Ignore non alphanumeric chars for grouping

* FIX: properly deduplicate user emails by merging accounts

* FIX: don't merge empty UserEmails

* Improve logging

* Merge users AFTER fixing primary key sequences

* Parallelize user merging

* Save duplicated users structure for debugging purposes

* Add progress logging for the (multiple hour) user merging step
2022-11-28 16:30:19 -03:00
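
The "PERF: stream Permalink file" item above describes writing once per line instead of accumulating output. A minimal sketch of that idea follows. The helper name `stream_permalinks`, the comma-separated line format, and the sample paths are assumptions made for illustration and are not taken from the importer itself.

```ruby
require "stringio"

# Hypothetical helper (not from the importer): writes one line per record
# directly to `io` instead of building the whole file as a single string,
# so memory use does not grow with the number of records.
def stream_permalinks(records, io)
  count = 0
  records.each do |old_path, new_path|
    # Assumed line format: "old_path,new_path"
    io.puts("#{old_path},#{new_path}")
    count += 1
  end
  count
end

# Usage with an in-memory IO; a real export would pass an opened File.
out = StringIO.new
written = stream_permalinks([["old/1", "/t/1"], ["old/2", "/t/2"]], out)
```

Returning the running count also gives the caller something to report as progress while the file is exported.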
base.rb              Fixes for vBulletin bulk importer (#17618)                      2022-11-28 16:30:19 -03:00
discourse_merger.rb  FIX: Use proper ActiveRecord method in import scripts           2022-05-09 11:09:27 +02:00
phpbb_postgresql.rb  DEV: Bulk imports should find existing users by email (#14468)  2021-09-29 00:20:06 +02:00
vanilla.rb           DEV: Fix methods removed in Ruby 3.2 (#15459)                   2022-01-05 18:45:08 +01:00
vbulletin5.rb        DEV: Fix methods removed in Ruby 3.2 (#15459)                   2022-01-05 18:45:08 +01:00
vbulletin.rb         Fixes for vBulletin bulk importer (#17618)                      2022-11-28 16:30:19 -03:00
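
The commit message mentions deduplicating category names and ignoring non-alphanumeric characters for grouping. One way to read that is sketched below. Both method names and the normalization rule (lowercase, then strip anything that is not a letter or digit) are assumptions for illustration, not the importer's actual code.

```ruby
# Hypothetical normalization: lowercase the name and drop every character
# that is not a letter or digit, so "Off-Topic" and "Off Topic" collide.
def grouping_key(name)
  name.downcase.gsub(/[^[:alnum:]]/, "")
end

# Returns only the groups that contain more than one original name,
# i.e. the candidates for deduplication.
def duplicate_category_groups(names)
  names.group_by { |name| grouping_key(name) }
       .values
       .select { |group| group.size > 1 }
end

duplicates = duplicate_category_groups(["Off-Topic", "Off Topic", "News"])
```

Grouping on a normalized key keeps the original names intact, so a later step can still decide which spelling to keep.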