Commit Graph

21 Commits

Author SHA1 Message Date
lutangar
ef14cf4a5c
feat(transcription): groundwork
chore: fiddling around some more

chore: add ctranslate2 and timestamped

chore: add performance markers

chore: refactor test

chore: change worflow name

chore: ensure Python3

chore(duration): convert to chai/mocha syntahx

chore(transcription): add individual tests for others transcribers

chore(transcription): implement formats test of all implementations

Also compare result of other implementation to the reference implementation

chore(transcription): add more test case with other language and models size and local model

chore(test): wip ctranslate 2 adapat

chore(transcription): wip transcript file and benchmark

chore(test): clean a bit

chore(test): clean a bit

chore(test): refacto timestamed spec

chore(test): update workflow

chore(test): fix glob expansion with sh

chore(test): extract some hw info

chore(test): fix async tests

chore(benchmark): add model info

feat(transcription): allow use of a local mode in timestamped-whisper

feat(transcription): extract run and profiling info in own value object

feat(transcription): extract run concept in own class an run more bench

chore(transcription): somplify run object only a uuid is now needed and add more benchmark scenario

docs(transcription): creates own package readme

docs(transcription): add local model usage

docs(transcription): update README

fix(transcription): use fr video for better comparison

chore(transcription): make openai comparison passed

docs(timestamped): clea

chore(transcription): change transcribers transcribe method signature

Introduce whisper builtin model.

fix(transcription): activate language detection

Forbid transcript creation without a language.
Add `languageDetection` flag to an engine and some assertions.

Fix an issue in `whisper-ctranslate2` :
https://github.com/Softcatala/whisper-ctranslate2/pull/93

chore(transcription): use PeerTube time helpers instead of custom ones

Update existing time function to output an integer number of seconds and add a ms human-readable time formatter with hints of tests.

chore(transcription): use PeerTube UUID helpers

chore(transcription): enable CER evaluation

Thanks to this recent fix in Jiwer <3
https://github.com/jitsi/jiwer/issues/873

chore(jiwer): creates JiWer package

I'm not very happy with the TranscriptFileEvaluator constructor... suggestions ?

chore(JiWer): add usage in README

docs(jiwer): update JiWer readme

chore(transcription): use FunMOOC video in fixtures

chore(transcription): add proper english video fixture

chore(transcription): use os tmp directory where relevant

chore(transcription): fix jiwer cli test reference.txt

chore(transcription): move benchmark out of tests

chore(transcription): remove transcription workflow

docs(transcription): add benchmark info

fix(transcription): use ms precision in other transcribers

chore(transcription): simplify most of the tests

chore(transcription): remove slashes when building path with join

chore(transcription): make fromPath method async

chore(transcription): assert path to model is a directory for CTranslate2 transcriber

chore(transcription): ctranslate2 assertion

chore(transcription): ctranslate2 assertion

chore(transcription): add preinstall script for Python dependencies

chore(transcription): add download and unzip utils functions

chore(transcription): add download and unzip utils functions

chore(transcription): download & unzip models fixtures

chore(transcription): zip

chore(transcription): raise download file test timeout

chore(transcription): simplify download file test

chore(transcription): add transcriptions test to CI

chore(transcription): raise test preconditions timeout

chore(transcription): run preinstall scripts before running ci

chore(transcription): create dedicated tmp folder for transcriber tests

chore(transcription): raise timeout some more

chore(transcription): raise timeout some more

chore(transcription): raise timeout some more

chore(transcription): raise timeout some more

chore(transcription): raise timeout some more

chore(transcription): raise timeout some more

chore(transcription): raise timeout some more

chore(transcription): raise timeout some more

chore(transcription): use short video for local model test

chore(transcription): raise timeout some more

chore(transcription): raise timeout some more

chore(transcription): raise timeout some more

chore(transcription): setup verbosity based on NODE_ENV value
2024-06-28 08:43:40 +02:00
Chocobozzz
4a67994775 Use my-embed component 2024-05-30 16:31:16 +02:00
Chocobozzz
29329d6c45 Implement auto tag on comments and videos
* Comments and videos can be automatically tagged using core rules or
   watched word lists
 * These tags can be used to automatically filter videos and comments
 * Introduce a new video comment policy where comments must be approved
   first
 * Comments may have to be approved if the user auto block them using
   core rules or watched word lists
 * Implement FEP-5624 to federate reply control policies
2024-05-29 15:03:14 +02:00
Chocobozzz
5cb3e6a0b8
Use sessionId instead of IP to identify viewer
Breaking: YAML config `ip_view_expiration` is renamed `view_expiration`
Breaking: Views are taken into account after 10 seconds instead of 30
seconds (can be changed in YAML config)

Purpose of this commit is to get closer to other video platforms where
some platforms count views on play (mux, vimeo) or others use a very low
delay (instagram, tiktok)

We also want to improve the viewer identification, where we no longer
use the IP but the `sessionId` generated by the web browser. Multiple
viewers behind a NAT can now be able to be identified as independent
viewers (this method is also used by vimeo or mux)
2024-04-04 16:27:40 +02:00
Chocobozzz
7816fa4d48
Fix lint 2024-04-03 16:16:06 +02:00
Chocobozzz
61fec4e4ef
Better seconds to time formatting 2024-04-03 14:50:30 +02:00
Chocobozzz
11521f231f
Generate small versions of banners too 2024-03-27 15:08:09 +01:00
Chocobozzz
ca889dbbb8
Ensure time to int returns an integer 2024-03-27 09:04:34 +01:00
Chocobozzz
a246c44504
Support tr locale 2024-03-26 14:25:12 +01:00
Chocobozzz
b6b1aaa56f
Add video aspect ratio in server 2024-02-27 15:24:34 +01:00
Chocobozzz
8573e5a80a Implement user import/export in server 2024-02-21 13:49:08 +01:00
Lety Does Stuff
c4b039886e
Fix the escapeAttribute function using HTML entities instead of backslash escapes (#6206)
* Fix the escapeAttribute function using HTML entities instead of backslash escapes

* Fix tests

---------

Co-authored-by: Chocobozzz <me@florianbigard.com>
2024-02-15 14:39:59 +01:00
Chocobozzz
3608eb4f1e
Fix input mask with 10h+ videos 2024-01-03 11:10:41 +01:00
Chocobozzz
ea685879bb
Fix time to int parsing 2023-12-15 09:54:08 +01:00
Chocobozzz
edc695263f
Escape quotes for html attributes 2023-12-14 11:33:08 +01:00
Chocobozzz
f9e710e7d4
Fix chapters import 2023-11-29 14:12:13 +01:00
Chocobozzz
ad801093b9
Simplify for loop 2023-10-30 11:17:46 +01:00
Chocobozzz
4fa78cda92
Fix timetoint
01:02 was translated to 01h02m instead of 01m02s
2023-10-30 10:20:25 +01:00
Chocobozzz
6495764268
Fix chapters extract 2023-08-30 19:24:01 +02:00
Chocobozzz
77b70702d2
Add video chapters support 2023-08-28 16:17:31 +02:00
Chocobozzz
3a4992633e
Migrate server to ESM
Sorry for the very big commit that may lead to git log issues and merge
conflicts, but it's a major step forward:

 * Server can be faster at startup because imports() are async and we can
   easily lazy import big modules
 * Angular doesn't seem to support ES import (with .js extension), so we
   had to correctly organize peertube into a monorepo:
    * Use yarn workspace feature
    * Use typescript reference projects for dependencies
    * Shared projects have been moved into "packages", each one is now a
      node module (with a dedicated package.json/tsconfig.json)
    * server/tools have been moved into apps/ and is now a dedicated app
      bundled and published on NPM so users don't have to build peertube
      cli tools manually
    * server/tests have been moved into packages/ so we don't compile
      them every time we want to run the server
 * Use isolatedModule option:
   * Had to move from const enum to const
     (https://www.typescriptlang.org/docs/handbook/enums.html#objects-vs-enums)
   * Had to explictely specify "type" imports when used in decorators
 * Prefer tsx (that uses esbuild under the hood) instead of ts-node to
   load typescript files (tests with mocha or scripts):
     * To reduce test complexity as esbuild doesn't support decorator
       metadata, we only test server files that do not import server
       models
     * We still build tests files into js files for a faster CI
 * Remove unmaintained peertube CLI import script
 * Removed some barrels to speed up execution (less imports)
2023-08-11 15:02:33 +02:00