Commit Graph

650 Commits

Author SHA1 Message Date
Michael Sandler
4e7e6c339f
FIX: Mbox import script tried to modify frozen string (#27768) 2024-07-11 23:22:13 +02:00
Loïc Guitaut
2a28cda15c DEV: Update to lastest rubocop-discourse 2024-05-27 18:06:14 +02:00
carehabit
11877f3b9c
DEV: remove repetitive words (#26439) 2024-04-01 06:23:21 +08:00
Natalie Tay
9bc78625af
FIX: Enforce proper max for clean_orphan_uploads_grace_period_hours (#25235)
* FIX: Enforce proper max for clean_orphan_uploads_grace_period_hours

* Cast

* Set clean_orphan_uploads_grace_period_hours to max allowed
2024-01-15 10:32:07 +08:00
Gerhard Schlager
d7601388e5
DEV: Apply code format to import script (#25063) 2023-12-28 21:25:29 +01:00
Sebastian Wagner
050a285f40
FEATURE: Import Script for Fusionforge (#22281)
This is an import script for the forum/development platform https://www.fusionforge.org/projects/fusionforge
imports users, forums and posts including attachments
2023-12-28 20:36:30 +01:00
Jarek Radosz
6d7dd658a4
DEV: Update rubocop-discourse to 3.6.0 (#24945) 2023-12-18 13:44:36 +01:00
Kelv
2477bcc32e
DEV: lint against Layout/EmptyLineBetweenDefs (#24914) 2023-12-15 23:46:04 +08:00
Jarek Radosz
694b5f108b
DEV: Fix various rubocop lints (#24749)
These (21 + 3 from previous PRs) are soon to be enabled in rubocop-discourse:

Capybara/VisibilityMatcher
Lint/DeprecatedOpenSSLConstant
Lint/DisjunctiveAssignmentInConstructor
Lint/EmptyConditionalBody
Lint/EmptyEnsure
Lint/LiteralInInterpolation
Lint/NonLocalExitFromIterator
Lint/ParenthesesAsGroupedExpression
Lint/RedundantCopDisableDirective
Lint/RedundantRequireStatement
Lint/RedundantSafeNavigation
Lint/RedundantStringCoercion
Lint/RedundantWithIndex
Lint/RedundantWithObject
Lint/SafeNavigationChain
Lint/SafeNavigationConsistency
Lint/SelfAssignment
Lint/UnreachableCode
Lint/UselessMethodDefinition
Lint/Void

Previous PRs:
Lint/ShadowedArgument
Lint/DuplicateMethods
Lint/BooleanSymbol
RSpec/SpecFilePathSuffix
2023-12-06 23:25:00 +01:00
Jarek Radosz
24532653e6
FIX: A typo bug in an import script (#24553) 2023-11-25 18:10:42 +01:00
David Taylor
8a5d97ef3f
DEV: Update importers from PostUpload to UploadReference (#23681)
Discourse stopped using PostUpload in 9db8f00b3d. Since then, these importers have been writing to the table, but any data was totally unused. This commit updates the easy cases to use UploadReference, and adds an error to the discourse_merger import script, which needs more significant work.
2023-09-27 15:01:04 +01:00
Jarek Radosz
94649565ce
DEV: Correct Style/RedundantReturn rubocop issues (#23052) 2023-08-10 02:03:38 +02:00
Penar Musaraj
26a8e7da28
DEV: Remove redundant case in import script (#22882) 2023-07-31 14:52:06 -04:00
Selase Krakani
c90e399003
FIX: user_id arg override in Slack import (#22713)
Fixes `user_id` arg override during mention parsing.
2023-07-20 11:22:43 +00:00
Gerhard Schlager
e09ce99884
DEV: Slack import script (#22386)
It's very simple import script and currently imports only the following content:
* Users
* Messages as Discourse topics/posts
* Attachments

Each channel can be mapped to a category and tags. It uses regular expressions to convert formatted messages ("rich text") into Markdown used by Discourse. In the future we could convert the `blocks` attribute from each message into Markdown instead of applying regular expressions on the `text` attribute.
2023-07-04 21:37:45 +02:00
Leonardo Mosquera
c83914e2e5
FIX: fix normalize_raw method for nil inputs in migration scripts (#22304)
Various migration scripts define a normalize_raw method to do custom processing of post contents before storing it in the Post.raw and other fields.

They normally do not handle nil inputs, but it's a relatively common occurrence in data dumps.

Since this method is used from various points in the migration script, as it stands, the experience of using a migration script is that it will fail multiple times at different points, forcing you to fix the data or apply logic hacks every time then restarting.

This PR generalizes handling of nil input by returning a <missing> string.

Pros:

    no more messy repeated crashes + restarts
    consistency

Cons:

    it might hide data issues
        OTOH we can't print a warning on that method because it will flood the console since it's called from inside loops.

* FIX: zendesk import script: support nil inputs in normalize_raw
* FIX: return '<missing>' instead of empty string; do it for all methods
2023-06-29 13:22:47 -03:00
Constanza
911539c1b2
FIX: Removing arbitrary limit in a Discuz importer script query (#21686) 2023-05-23 17:07:09 -04:00
Gerhard Schlager
b24c35d887
DEV: phpBB3 importer should get quoted username from actual post (#20979)
The actual code in the import script didn't match the expected behavior as defined by the specs. This fixes it and is a follow-up to ad32fa56
2023-04-05 15:43:20 +02:00
ftc2
ad32fa5690
FIX: phpBB3 importer created invalid quote for posts (#20646)
for proper quotes (those including a valid reference to a source post), the
importer failed to yield a username, and the imported quote was broken.
2023-04-05 13:50:01 +02:00
Jarek Radosz
29e2e3ff3b
DEV: Fix random typos (#20937) 2023-04-03 19:27:32 +02:00
Bianca Nenciu
ccb345bd88
FEATURE: Update topic/comment embedding parameters (#20181)
This commit implements many changes to topic and comments embedding. It
deprecates the class_name field from EmbeddableHost and suggests using
the className parameter. discourse_username parameter has been
deprecated and it will fetch it from embedded site from the author or
discourse-username meta.

See the updated code sample from Admin > Customize > Embedding page.

* FEATURE: Add className parameter for Discourse embed

* DEV: Hide class_name from EmbeddableHost

* DEV: Deprecate class_name field of EmbeddableHost

* FEATURE: Use either author or discourse-username meta tag

* DEV: Deprecate discourse_username parameter

* DEV: Improve embed code sample
2023-02-28 14:31:59 +02:00
Jarek Radosz
f63d18d6ca
DEV: Update syntax_tree to 6.0.1 (#20466) 2023-02-27 17:20:00 +01:00
Loïc Guitaut
f7c57fbc19 DEV: Enable unless cops
We discussed the use of `unless` internally and decided to enforce
available rules from rubocop to restrict its most problematic uses.
2023-02-21 10:30:48 +01:00
Ted Johansson
25a226279a
DEV: Replace #pluck_first freedom patch with AR #pick in core (#19893)
The #pluck_first freedom patch, first introduced by @danielwaterworth has served us well, and is used widely throughout both core and plugins. It seems to have been a common enough use case that Rails 6 introduced it's own method #pick with the exact same implementation. This allows us to retire the freedom patch and switch over to the built-in ActiveRecord method.

There is no replacement for #pluck_first!, but a quick search shows we are using this in a very limited capacity, and in some cases incorrectly (by assuming a nil return rather than an exception), which can quite easily be replaced with #pick plus some extra handling.
2023-02-13 12:39:45 +08:00
GeckoLinux
d1e844841d
Fix occasional bug in order of imported comments (#20204)
This bug is actually a Drupal issue where some edited posts have their `created` and `changed` timestamps set to the same value. But even when that happens in Drupal it still maintains the correct post order in an affected thread. This PR makes the Discourse importer also maintain the original Drupal comment order by sorting comments in the source DB by their `cid`, which is sequential and never changes. More details from this post onward:
https://meta.discourse.org/t/large-drupal-forum-migration-importer-errors-and-limitations/246939/24?u=rahim123
2023-02-08 22:20:46 -05:00
Gerhard Schlager
c3e978ada9
DEV: Add import script for Yammer (#20074)
Co-authored-by: Jay Pfaffman <jay@literatecomputing.com>
2023-01-31 10:12:01 +01:00
David Taylor
436b3b392b
DEV: Apply syntax_tree formatting to script/* 2023-01-09 11:13:22 +00:00
David Taylor
d5491b13f5
DEV: Fix syntax/formatting in xenforo import script (#19761)
Followup to 7dfe85fc
2023-01-05 12:47:05 +00:00
GeckoLinux
cc5b4cd49a
FIX: change drupal permalink creation to use /node/
Drupal URL scheme for nodes begins with `/node/` , not `/topic/` .
2022-12-02 16:03:00 +11:00
Alan Guo Xiang Tan
7c321d3aad
PERF: Update Group#user_count counter cache outside DB transaction (#19256)
While load testing our user creation code path in production, we
identified that executing the DB statement to update the `Group#user_count` column within a
transaction is creating a bottleneck for us. This is because the
creation of a user and addition of the user to the relevant groups are
done in a transaction. When we execute the DB statement to update
`Group#user_count` for the relevant group, a row level lock is held
until the transaction completes. This row level lock acts like a global
lock when the server is creating users that will be added to the same
group in quick succession.

Instead of updating the counter cache within a transaction which the
default ActiveRecord `counter_cache` option does, we simply update the
counter cache outside of the committing transaction.

Co-authored-by: Rafael dos Santos Silva <xfalcox@gmail.com>

Co-authored-by: Rafael dos Santos Silva <xfalcox@gmail.com>
2022-11-30 11:52:08 -03:00
Leonardo Mosquera
bfecbde837
Fixes for vBulletin bulk importer (#17618)
* Allow taking table prefix from env var

* FIX: remove unused column references

The columns `filedata` and `extension` are not present in a v4.2.4
database, and they aren't used in the method anyways.

* FIX: report progress for tables without imported_id

* FIX: effectively check for AR validation errors

NOTE: other migration scripts also have this problem; see /t/58202

* FIX: properly count Posts when importing attachments

* FIX: improve logging

* Remove leftover comment

* FIX: show progress when exporting Permalink file

* PERF: stream Permalink file

The current way results in tons of memory usage; write once per line instead

* Document fixes needed

* WIP - deduplicate category names

* Ignore non alphanumeric chars for grouping

* FIX: properly deduplicate user emails by merging accounts

* FIX: don't merge empty UserEmails

* Improve logging

* Merge users AFTER fixing primary key sequences

* Parallelize user merging

* Save duplicated users structure for debugging purposes

* Add progress logging for the (multiple hour) user merging step
2022-11-28 16:30:19 -03:00
communiteq
7dfe85fcc7
DEV: Xenforo importer improvements (#18457)
* Fix: make expressions non-greedy
* Feature: import Xenforo avatars
* Feature: import Xenforo likes
* Feature: import Xenforo private messages
* Feature: Xenforo create permalinks
* Feature: Xenforo migrate view counts
* Fix: Xenforo list regexes
* Fix: Xenforo import all attachments
2022-11-28 16:42:39 +01:00
Pierre Ozoux
9e9235ca62
FEATURE: Add import script for Elgg (#19140) 2022-11-28 16:28:08 +01:00
Constanza
067c4deb4c
Fix comment to include phpbb 3.3, which is now supported (#18006) 2022-08-19 16:42:32 -04:00
Constanza
ef842a4b29
FEATURE: Adding a simple CSV importer (#17993) 2022-08-19 13:09:30 -04:00
Constanza
8836c8bcdf
FIX: the phpbbb import script was not parsing youtube tags (#17787) 2022-08-05 15:20:32 -04:00
communiteq
603f36ca4a
DEV: Support phpBB 3.3 imports (#17641)
* handle polls with duplicate items
* handle polls with incorrect poll_option_total values
* handle group IDs in personal messages
* support for version 3.3
2022-07-25 22:07:03 +02:00
Jay Pfaffman
7ab5dcf82f
FEATURE: my_bb import supports avatars (#17617) 2022-07-25 15:22:25 +02:00
Constanza
b9ac8e5748
Adding 3.2 to the versions of phpbb supported by the migration script (#17483) 2022-07-14 18:06:47 +05:30
Martin Brennan
fcc2e7ebbf
FEATURE: Promote polymorphic bookmarks to default and migrate (#16729)
This commit migrates all bookmarks to be polymorphic (using the
bookmarkable_id and bookmarkable_type) columns. It also deletes
all the old code guarded behind the use_polymorphic_bookmarks setting
and changes that setting to true for all sites and by default for
the sake of plugins.

No data is deleted in the migrations, the old post_id and for_topic
columns for bookmarks will be dropped later on.
2022-05-23 10:07:15 +10:00
Martin Brennan
222c8d9b6a
FEATURE: Polymorphic bookmarks pt. 3 (reminders, imports, exports, refactors) (#16591)
A bit of a mixed bag, this addresses several edge areas of bookmarks and makes them compatible with polymorphic bookmarks (hidden behind the `use_polymorphic_bookmarks` site setting). The main ones are:

* ExportUserArchive compatibility
* SyncTopicUserBookmarked job compatibility
* Sending different notifications for the bookmark reminders based on the bookmarkable type
* Import scripts compatibility
* BookmarkReminderNotificationHandler compatibility

This PR also refactors the `register_bookmarkable` API so it accepts a class descended from a `BaseBookmarkable` class instead. This was done because we kept having to add more and more lambdas/properties inline and it was very messy, so a factory pattern is cleaner. The classes can be tested independently as well.

Some later PRs will address some other areas like the discourse narrative bot, advanced search, reports, and the .ics endpoint for bookmarks.
2022-05-09 09:37:23 +10:00
Leonardo Mosquera
3e5faffb0d
DEV: mbox importer improvements (#16557)
* FIX: support specifying parent_category_id in mbox import metadata
* FIX: elide tabs from topic titles
* FIX: optionally fix Mailman from: addresses
* DEV: optionally elide anything up to the last = in email addresses
* Fix Mailmain broken from: detection
2022-04-29 13:24:29 -03:00
Jarek Radosz
2fc70c5572
DEV: Correctly tag heredocs (#16061)
This allows text editors to use correct syntax coloring for the heredoc sections.

Heredoc tag names we use:

languages: SQL, JS, RUBY, LUA, HTML, CSS, SCSS, SH, HBS, XML, YAML/YML, MF, ICS
other: MD, TEXT/TXT, RAW, EMAIL
2022-02-28 20:50:55 +01:00
Jarek Radosz
6f6406ea03
DEV: Fix random typos (#16066) 2022-02-28 10:20:58 +08:00
Michael Brown
3bf3b9a4a5 DEV: pull email address validation out to a new EmailAddressValidator
We validate the *format* of email addresses in many places with a match against
a regex, often with very slightly different syntax.

Adding a separate EmailAddressValidator simplifies the code in a few spots and
feels cleaner.

Deprecated the old location in case someone is using it in a plugin.

No functionality change is in this commit.

Note: the regex used at the moment does not support using address literals, e.g.:
* localpart@[192.168.0.1]
* localpart@[2001:db8::1]
2022-02-17 21:49:22 -05:00
Gerhard Schlager
6394d7cddf
DEV: Improve phpBB3 import script (#15956)
* Optional import of custom user fields from phpBB 3.1+
* Optional import of likes from phpBB3
  Requires the phpBB "Thanks for posts" extension
* Fix import of bookmarks from phpBB3
* Update `created_at` of existing user
* Support mapping of phpBB forums to existing Discourse categories
  This is in addition to the ability of merging phpBB forums and importing into newly created Discourse categories.
2022-02-16 13:04:31 +01:00
Gerhard Schlager
33d6ed60a4
DEV: Don't import year of birth (#15937)
The cakeday plugin doesn't use the year.
2022-02-14 18:10:35 +01:00
Gerhard Schlager
6a41ec179c
FIX: Default settings for phpBB3 import were broken (#15913) 2022-02-11 18:18:54 +01:00
Canapin
ea2fd75d10
DEV: Fix some regexes in phpBB3 import script (#15829)
1. bbcode hashes don't always have exactly 8 characters.

2. colors aren't always hex values, it can be a color string ("red", "blue", etc).

3. The closing tag of smileys doesn't always include a `:` character (the start of the regex was already right for this particular issue)
2022-02-07 16:16:46 +01:00
Peter Zhu
c5fd8c42db
DEV: Fix methods removed in Ruby 3.2 (#15459)
* File.exists? is deprecated and removed in Ruby 3.2 in favor of
File.exist?
* Dir.exists? is deprecated and removed in Ruby 3.2 in favor of
Dir.exist?
2022-01-05 18:45:08 +01:00