A Detailed Guide to Building an AI Platform for Automatic Narration and Remixing of Short Dramas and Short Videos with NarratoAI

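Background

A while ago I picked up some video editing while making a digital photo album for my kid, so I thought I'd join the fun and cut a few short dramas: partly to put the skill to use, and partly because I was curious why short-drama creation is so popular right now.

After a first look around, I found my editing skills were already out of date. The pros mostly make narrated recaps, and AI can now handle both the editing and the narration. Is it really that good? I have been following all sorts of AI tools lately, so I spent two evenings trying out AI video editing. Sticking to the principle of never paying when a free option exists, I compared a range of AI tools and picked NarratoAI to try. The verdict after trying it: I feel like I'm back in the game.

Disclaimer: this is a purely technical post.

Short-drama editing is not only a popular area of content creation but also a good practical setting for learning AI. By working AI tools into each stage of short-drama production (narration lines, script generation, automatic editing, and so on), creators can pick up current techniques while finishing a piece of work. This article uses a number of AI tools, all of which are available for free.

If you run into problems while reproducing these steps, follow me and leave a comment.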

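Environment requirements

I used the machine I already had: a Huawei MateBook Pro laptop running Windows 11.

Hardware:
- CPU: i7-1260P (12 cores, 16 threads, 4.7 GHz turbo), which handles video parsing and editing tasks efficiently.
- GPU: Intel® Iris® Xe Graphics (96 EU), which supports hardware-accelerated video encoding and decoding; install the latest driver to enable it.
Software:
- Python 3.8+: Anaconda is recommended for managing environments.
- FFmpeg: used for video processing; add it to the system PATH.
- Git: used to clone the code repository.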

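Installing Anaconda

An earlier article on my WeChat official account, "Essential Tools for Learning AI: Installing, Configuring, and Optimizing Anaconda3" (https://mp.weixin.qq.com/s/karflR2eWIOmb4NcMIrD3Q), covers this in detail.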

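Video tools

Installing ImageMagick

ImageMagick is a powerful open-source image processing suite. In more detail:

Characteristics
- Cross-platform: runs on Linux, Windows, macOS, and most other operating systems.
- Free and open source: released under the ImageMagick License, an Apache-style license; the full source is available and can be freely used, copied, modified, and distributed.
- Broad language support: interfaces are available for Perl, C, C++, Python, PHP, Ruby, Java, and other languages.
Capabilities
- Format conversion: converts images between more than 200 formats, including common ones such as JPEG, PNG, GIF, and TIFF, and less common ones such as RAW and SVG.
- Basic transformations: resize, rotate, crop, flip, and trim images.
- Effects: blur, sharpen, threshold, color adjustment, and similar effects.
- Animation: builds a GIF animation from a set of images.
- Compositing and editing: combines several images into one, adds text or shapes to an image, and adds borders or frames.
Use cases
- Web development: automatically generate thumbnails and optimize image formats to speed up page loads.
- E-commerce platforms: batch-process product images, for example cropping, watermarking, and format unification.
- Data analysis and machine learning: preprocess images in a dataset, for example resizing and denoising.
- Personal projects: batch-organize pictures, generate albums, or handle everyday image processing.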

ImageMagick download: https://imagemagick.org/archive/binaries/ImageMagick-7.1.1-40-Q16-x64-static.exe

After installing, open cmd and verify that it runs, for example with magick -version.

Installing FFmpeg

Official FFmpeg download page: https://www.ffmpeg.org/download.html

After installing, verify that it runs, for example with ffmpeg -version.
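The checks for ImageMagick and FFmpeg can also be scripted. The sketch below is a hypothetical helper, not part of NarratoAI: it runs a tool with its version flag and returns the first line of output, or None when the executable is not on PATH. It assumes the executables are named magick and ffmpeg.

```python
import shutil
import subprocess
from typing import List, Optional


def first_version_line(cmd: List[str]) -> Optional[str]:
    """Run a version command; return its first output line, or None if the tool is missing."""
    if shutil.which(cmd[0]) is None:
        return None  # executable not found on PATH
    result = subprocess.run(
        cmd,
        capture_output=True,
        encoding="utf-8",
        errors="replace",  # avoid decode errors on non-UTF-8 consoles
        check=False,
    )
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else ""


if __name__ == "__main__":
    for cmd in (["magick", "-version"], ["ffmpeg", "-version"]):
        line = first_version_line(cmd)
        print(f"{cmd[0]}: {line if line is not None else 'not found on PATH'}")
```

A result of "not found on PATH" usually means the install directory still needs to be added to the system environment variables.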

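Installing NarratoAI

Introduction

Highlights: support for Alibaba's QwenVL large model, usable from networks in mainland China; short-drama remixing; and one-click merging of video and subtitles.

Alibaba QwenVL support, usable from networks in mainland China
This release adds the video understanding capability of the QwenVL large model. It works from networks in mainland China and comes with a free quota.
Short-drama remixing
The tool now supports remixing short dramas and can analyze videos up to 10 minutes long.
Millisecond-level timestamps for precise cuts
Timestamps are now accurate to the millisecond, which is especially useful for editing.
One-click merging of video and subtitles
The new merge feature for video and subtitles makes it quicker to organize footage.
Script upload
A script upload feature is now available, so creation can follow a script step by step.
One-click cache cleanup
If the tool becomes sluggish after long use, the cache can be cleared with one click.
One-click transcription
The one-click transcription feature makes it easy to extract text.
Automatic retry when TTS generation fails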
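The automatic retry for failed TTS generation can be pictured as a small retry loop. The sketch below is not NarratoAI's implementation; with_retries, its parameters, and flaky_tts are hypothetical names used only to illustrate the idea.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def with_retries(task: Callable[[], T], attempts: int = 3, delay_s: float = 0.0) -> T:
    """Call task() until it succeeds or `attempts` tries have failed."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error: Optional[Exception] = None
    for attempt in range(1, attempts + 1):
        try:
            return task()
        except Exception as exc:  # a real tool would catch narrower error types
            last_error = exc
            if attempt < attempts:
                time.sleep(delay_s)
    assert last_error is not None
    raise last_error


# Stand-in for a TTS call that fails twice, then succeeds.
calls = {"count": 0}


def flaky_tts() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("simulated TTS failure")
    return "audio.mp3"


result = with_retries(flaky_tts, attempts=3)
print(result, "after", calls["count"], "attempts")
```

When every attempt fails, the last exception is re-raised so the caller still sees the underlying error.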

Downloading the NarratoAI code

# Clone the repository
$ git clone https://github.com/linyqh/NarratoAI.git
Cloning into 'NarratoAI'...
remote: Enumerating objects: 1198, done.
remote: Counting objects: 100% (290/290), done.
remote: Compressing objects: 100% (131/131), done.
remote: Total 1198 (delta 178), reused 161 (delta 159), pack-reused 908 (from 2)
Receiving objects: 100% (1198/1198), 7.30 MiB | 3.15 MiB/s, done.
Resolving deltas: 100% (759/759), done.

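Setting up the Python virtual environment

Installing the dependencies directly with pip tends to produce installation errors that take time to sort out, so an Anaconda3 environment is recommended.

# Create a Python virtual environment
conda create -n env_name python=3.10 -y
conda activate env_name
python -V
# Output:
(base) C:\Users\seane>conda activate env_name
(env_name) C:\Users\seane>python -V
Python 3.10.16
(env_name) C:\Users\seane>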

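Installing NarratoAI's dependencies

# Enter the code directory downloaded earlier
cd NarratoAI
# Install the dependencies; if this fails, see the troubleshooting summary later in the article
pip install -r requirements.txt
# Install PyTorch (optional on machines without a GPU)
pip3 install torch torchvision torchaudio

Note: installing PyTorch for Intel integrated graphics (Huawei MateBook Pro 2022) will be covered in detail in a later article. PyTorch is a Python AI toolkit, and a build that uses the Intel integrated GPU performs better than the CPU-only build.

After a successful installation, the window shows the following: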
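To confirm whether the optional PyTorch install succeeded, and which device it would use, a short check like the following can help. This is a sketch rather than anything NarratoAI ships; describe_torch is a hypothetical helper name.

```python
import importlib.util


def describe_torch() -> str:
    """Report whether PyTorch is importable and whether CUDA is available."""
    if importlib.util.find_spec("torch") is None:
        return "torch not installed"
    import torch  # imported lazily so the check works without torch

    device = "cuda" if torch.cuda.is_available() else "cpu"
    return f"torch {torch.__version__} ({device})"


print(describe_torch())
```

On a laptop with only integrated graphics, a stock PyTorch build is expected to report cpu.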

Successfully installed aiohappyeyeballs-2.4.6 aiohttp-3.10.11 aiosignal-1.3.2 altair-5.5.0 anyio-4.8.0 appdirs-1.4.4 attrs-25.1.0 av-12.3.0 azure-cognitiveservices-speech-1.37.0 blinker-1.9.0 brotli-1.1.0 cachetools-5.5.2 certifi-2025.1.31 chardet-5.2.0 charset-normalizer-3.4.1 click-8.1.8 colorama-0.4.6 coloredlogs-15.0.1 ctranslate2-4.5.0 dashscope-1.15.0 decorator-4.4.2 distro-1.9.0 edge-tts-6.1.19 fastapi-0.115.8 faster-whisper-1.0.3 flatbuffers-25.2.10 frozenlist-1.5.0 g4f-0.3.0.10 git-changelog-2.5.3 gitdb-4.0.12 gitpython-3.1.44 google-ai-generativelanguage-0.6.15 google-api-core-2.24.1 google-api-python-client-2.161.0 google-auth-2.38.0 google-auth-httplib2-0.2.0 google.generativeai-0.8.4 googleapis-common-protos-1.68.0 grpcio-1.70.0 grpcio-status-1.70.0 h11-0.14.0 httpcore-1.0.7 httplib2-0.22.0 httpx-0.27.2 huggingface-hub-0.29.1 humanfriendly-10.0 idna-3.10 imageio-2.37.0 imageio_ffmpeg-0.6.0 jiter-0.8.2 joblib-1.4.2 jsonschema-4.23.0 jsonschema-specifications-2024.10.1 loguru-0.7.3 markdown-it-py-3.0.0 mdurl-0.1.2 moviepy-2.0.0.dev2 multidict-6.1.0 narwhals-1.27.1 onnxruntime-1.20.1 openai-1.53.1 opencv-python-4.10.0.84 pandas-2.2.3 pillow-10.3.0 proglog-0.1.10 propcache-0.3.0 proto-plus-1.26.0 protobuf-5.29.3 pyarrow-19.0.1 pyasn1-0.6.1 pyasn1-modules-0.4.1 pycryptodome-3.21.0 pydantic-2.6.4 pydantic-core-2.16.3 pydeck-0.9.1 pydub-0.25.1 pygments-2.19.1 pyparsing-3.2.1 pyreadline3-3.5.4 pysrt-1.1.2 python-dateutil-2.9.0.post0 python-dotenv-1.0.1 python-multipart-0.0.20 pytz-2025.1 pyyaml-6.0.2 redis-5.0.3 referencing-0.36.2 regex-2024.11.6 requests-2.31.0 rich-13.9.4 rpds-py-0.23.1 rsa-4.9 safetensors-0.5.2 scikit-learn-1.5.2 scipy-1.15.2 semver-3.0.4 six-1.17.0 smmap-5.0.2 sniffio-1.3.1 starlette-0.45.3 streamlit-1.40.2 tenacity-9.0.0 threadpoolctl-3.5.0 tiktoken-0.8.0 tokenizers-0.21.0 toml-0.10.2 tomli-2.0.2 tornado-6.4.2 tqdm-4.67.1 transformers-4.47.0 tzdata-2025.1 uritemplate-4.1.1 urllib3-2.2.3 uvicorn-0.27.1 watchdog-5.0.2 win32-setctime-1.2.0 
yarl-1.18.3 yt-dlp-2024.11.18

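Preparing the AI models

Speech model: Whisper

The Whisper model is used to generate subtitles and transcribe video. It runs on either CPU or GPU; CPU is the default, and GPU is faster.

Download: https://huggingface.co/guillaumekln/faster-whisper-large-v2
Extract it under NarratoAI/app/models, so that the model sits at NarratoAI/app/models/faster-whisper-large-v2 (the path that appears in the run log later in this article).

Pretrained language model: BERT

Download the BERT model: github.com
Extract the archive.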
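A quick way to confirm the model landed in the right place is to look for the files a converted Whisper model normally contains. The sketch below is hypothetical: the file names in EXPECTED_FILES are an assumption about the download's contents, so adjust them if your copy differs.

```python
from pathlib import Path
from typing import Iterable, List

# Assumed file names for a CTranslate2-converted Whisper model.
EXPECTED_FILES = ("config.json", "model.bin", "tokenizer.json", "vocabulary.txt")


def missing_model_files(model_dir: Path, expected: Iterable[str] = EXPECTED_FILES) -> List[str]:
    """Return the expected file names that are absent from model_dir."""
    return [name for name in expected if not (model_dir / name).is_file()]


if __name__ == "__main__":
    model_dir = Path("NarratoAI/app/models/faster-whisper-large-v2")
    missing = missing_model_files(model_dir)
    print("model looks complete" if not missing else f"missing: {missing}")
```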

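Video understanding model (Gemini)

Gemini is a large model used here for video understanding. It is a hosted online model accessed through an API, so you need to apply for an API key.

Note: if you cannot reach Google's sites, skip to the next section and use the Qwen-based video model.

Go to Google AI Studio to apply for an API key.
Sign up and log in, click Get API key, and save the key for the configuration step later.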

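Video understanding model (QwenVL)

Go to https://bailian.console.aliyun.com/ and register an account.
Complete real-name verification in the account center.
Activate the model service.
Select VL-max latest.
Apply for an API key and save it for the configuration step later.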

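Running and editing the configuration

First run: generating the configuration file

**Note: set up the Python virtual environment before running the tool.** See the earlier section for how to activate it.

The first time NarratoAI runs, it generates a configuration file, config.toml, in the NarratoAI directory.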

(pytorch251) d:\code\NarratoAI>streamlit run webui.py
Welcome to Streamlit!
If you’d like to receive helpful onboarding emails, news, offers, promotions,
and the occasional swag, please enter your email address below. Otherwise,
leave this field blank.
Email:
You can find our privacy policy at https://streamlit.io/privacy-policy
Summary:
- This open source library collects usage statistics.
- We cannot see and do not store information contained inside Streamlit apps,
such as text, charts, images, etc.
- Telemetry data is stored in servers in the United States.
- If you'd like to opt out, add the following to %userprofile%/.streamlit/config.toml,
creating that file if necessary:
[browser]
gatherUsageStats = false
You can now view your Streamlit app in your browser.
Local URL: http://localhost:8501
Network URL: http://192.168.1.14:8501
2025-02-23 15:45:16.758 | INFO     | app.config.config:load_config:20 - copy config.example.toml to config.toml
2025-02-23 15:45:16.774 | INFO     | app.config.config:load_config:22 - load config from file: D:\code\NarratoAI/config.toml
2025-02-23 15:45:16.803 | INFO     | app.config.config:<module>:71 - NarratoAI v0.3.9
2025-02-23 15:46:01 | INFO | "./app\utils\utils.py:589": init_resources - 已復制系統字體: simhei.ttf
2025-02-23 15:46:02.242 Examining the path of torch.classes raised: Tried to instantiate class '__path__._path', but it does not exist! Ensure that it is registered via torch::class_

Basic configuration

Gemini API key (vision_gemini_api_key)

Enter the API key you obtained in the config.toml file in the project root:

    ########## Vision Gemini API Key
    vision_gemini_api_key = "xxxx"
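Before launching, it can be handy to check which API keys in config.toml are still empty or set to the "xxxx" placeholder. The sketch below is a hypothetical helper; the [app] section and the sample values are assumptions for illustration, and the real file has many more settings.

```python
try:
    import tomllib  # Python 3.11+
except ModuleNotFoundError:
    import tomli as tomllib  # backport for older Pythons such as 3.10

from typing import Any, Dict, List


def find_placeholder_keys(
    config: Dict[str, Any], placeholder: str = "xxxx", prefix: str = ""
) -> List[str]:
    """Return dotted names of '*api_key' settings that are empty or still the placeholder."""
    found: List[str] = []
    for key, value in config.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            found.extend(find_placeholder_keys(value, placeholder, prefix=f"{name}."))
        elif key.endswith("api_key") and value in ("", placeholder):
            found.append(name)
    return found


# Hypothetical excerpt standing in for the real config.toml.
SAMPLE = """
[app]
vision_gemini_api_key = "xxxx"
text_openai_api_key = "sk-example"
"""

print(find_placeholder_keys(tomllib.loads(SAMPLE)))
```

To check the real file, open it in binary mode and pass the handle to tomllib.load instead of parsing the sample string.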

text_openai_api_key

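This is the API key for the large model that writes the narration copy. It is best not to use the Gemini model for copy generation any more; there are plenty of free, capable APIs available in China, such as Doubao on Volcano Engine Ark and DeepSeek.

Specifying the ImageMagick path

Specifying the ffmpeg path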

ffmpeg_path = "D:\\AppGallery\\bin\\ffmpeg.exe"
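The doubled backslashes in ffmpeg_path are TOML escaping rather than part of the path. The short sketch below parses three spellings to show which ones are valid; it uses only the path from the line above and makes no claim about how NarratoAI itself reads the file.

```python
try:
    import tomllib  # Python 3.11+
except ModuleNotFoundError:
    import tomli as tomllib  # backport for older Pythons

# Double-quoted TOML strings need each backslash doubled.
escaped = tomllib.loads(r'ffmpeg_path = "D:\\AppGallery\\bin\\ffmpeg.exe"')
# Single-quoted TOML strings are literal: backslashes are kept as-is.
literal = tomllib.loads(r"ffmpeg_path = 'D:\AppGallery\bin\ffmpeg.exe'")

# Single backslashes inside double quotes are rejected (\A is not a valid escape).
try:
    tomllib.loads(r'ffmpeg_path = "D:\AppGallery\bin\ffmpeg.exe"')
    single_backslash_ok = True
except tomllib.TOMLDecodeError:
    single_backslash_ok = False

print(escaped["ffmpeg_path"], escaped == literal, single_backslash_ok)
```

If the configuration fails to load after editing a Windows path, an unescaped backslash is a likely culprit.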

proxy.http and proxy.https

Configuring fonts and BGM

Download the font file

Download: https://zenodo.org/records/13293144/files/STHeitiMedium.ttc

Destination directory: NarratoAI\resource\fonts

Download the BGM file
1. Download: https://zenodo.org/records/13293150/files/output000.mp3
2. Destination directory: NarratoAI\resource\songs

Running again

After editing the configuration file, start NarratoAI again:

(pytorch251) d:\code\NarratoAI>streamlit run webui.py

Open in a browser:

http://localhost:8501/
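If the page does not load, it helps to check whether anything is listening on the port at all. The sketch below is a generic TCP check, not a NarratoAI feature; is_port_open is a hypothetical helper name, and 8501 is the port shown in the Streamlit startup output.

```python
import socket


def is_port_open(host: str = "localhost", port: int = 8501, timeout_s: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("NarratoAI web UI reachable:", is_port_open())
```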

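QwenVL model configuration (strongly recommended for users in China)

If the Gemini model is unreachable, switch to the QwenVL model.

Enter the API key you obtained earlier in the NarratoAI web UI, then test the connection.
It should report that the QwenVL model is available.

Video editing examples

Example 1 (video analysis): generating a narration script

Choose a script template and upload a video file. The file must be smaller than 200 MB; larger files can be saved to NarratoAI\resource\videos as the on-screen hint suggests.

After uploading a video, click the button that has the AI generate a narration script from the footage.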

[{"timestamp": "00:00:38,500-00:00:38,500","picture": "畫面中顯示兩個人在一個室內環境中。左邊的人背對著鏡頭，穿著深色上衣和紅色褲子，頭發扎成馬尾。右邊的人面向鏡頭，穿著同樣的深色上衣和紅色褲子，雙手張開，似乎在做某種動作或表演。背景是一扇白色的門和淺藍色的墻壁，地板是灰色的瓷磚。畫面上方有紅色的文字“來到你的面前”，右下角有抖音的標志和一些文字信息。","narration": "倆同款著裝室內整活","OST": 2,"new_timestamp": "00:00:00,000-00:00:00,000"}
]
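The generated script is a JSON array, which makes it easy to sanity-check before moving on. The sketch below is a hypothetical validator, not NarratoAI's own format check: it only verifies that the four fields seen in the sample are present and that the timestamp has the HH:MM:SS,mmm-HH:MM:SS,mmm shape. The "..." values stand in for the long text fields.

```python
import json
import re
from typing import Any, Dict, List

# HH:MM:SS,mmm-HH:MM:SS,mmm, as in the sample output
TIMESTAMP_RE = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3}-\d{2}:\d{2}:\d{2},\d{3}$")
REQUIRED_KEYS = ("timestamp", "picture", "narration", "OST")


def check_script(items: List[Dict[str, Any]]) -> List[str]:
    """Return human-readable problems; an empty list means the script passed."""
    problems: List[str] = []
    for index, item in enumerate(items):
        for key in REQUIRED_KEYS:
            if key not in item:
                problems.append(f"item {index}: missing '{key}'")
        if "timestamp" in item and not TIMESTAMP_RE.match(str(item["timestamp"])):
            problems.append(f"item {index}: bad timestamp '{item['timestamp']}'")
    return problems


script = json.loads(
    '[{"timestamp": "00:00:38,500-00:00:38,500", "picture": "...", '
    '"narration": "...", "OST": 2}]'
)
print(check_script(script))
```

Note that the sample's start and end times are identical, which is why the backend log reports a duration of 0.000 seconds for it.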

Backend log:

UnicodeDecodeError: 'gbk' codec can't decode byte 0x80 in position 1579: illegal multibyte sequence
2025-02-23 22:58:51 | INFO | "./app\utils\video_processor_v2.py:312": process_video_pipeline -
步驟2: 從壓縮視頻提取關鍵幀...
2025-02-23 22:58:51 | INFO | "./app\utils\video_processor_v2.py:234": process_video - 讀取視頻幀...
讀取視頻: 100%|██████████| 7319/7319 [00:06<00:00, 1049.69it/s]
2025-02-23 22:58:58 | INFO | "./app\utils\video_processor_v2.py:246": process_video - 檢測場景邊界...
2025-02-23 22:59:08 | INFO | "./app\utils\video_processor_v2.py:248": process_video - 檢測到 1 個場景邊界
2025-02-23 22:59:17 | INFO | "./app\utils\video_processor_v2.py:290": process_video_pipeline - 步驟1: 壓縮視頻...
Exception in thread Thread-69 (_readerthread):
Traceback (most recent call last):
  File "D:\AppGallery\Anaconda3\envs\pytorch251\Lib\threading.py", line 1075, in _bootstrap_inner
    self.run()
  File "D:\AppGallery\Anaconda3\envs\pytorch251\Lib\threading.py", line 1012, in run
    self._target(*self._args, **self._kwargs)
  File "D:\AppGallery\Anaconda3\envs\pytorch251\Lib\subprocess.py", line 1601, in _readerthread
    buffer.append(fh.read())
                  ^^^^^^^^^
UnicodeDecodeError: 'gbk' codec can't decode byte 0xa2 in position 1603: illegal multibyte sequence
2025-02-23 22:59:27 | INFO | "./app\utils\video_processor_v2.py:312": process_video_pipeline -
步驟2: 從壓縮視頻提取關鍵幀...
2025-02-23 22:59:28 | INFO | "./app\utils\video_processor_v2.py:234": process_video - 讀取視頻幀...
讀取視頻: 100%|██████████| 3359/3359 [00:11<00:00, 295.50it/s]
2025-02-23 22:59:39 | INFO | "./app\utils\video_processor_v2.py:246": process_video - 檢測場景邊界...
2025-02-23 22:59:48 | INFO | "./app\utils\video_processor_v2.py:248": process_video - 檢測到 1 個場景邊界
提取關鍵幀: 100%|██████████| 1/1 [00:56<00:00, 56.25s/it]
保存壓縮關鍵幀: 100%|██████████| 1/1 [00:00<00:00, 16.25it/s]
2025-02-23 23:00:47 | INFO | "./app\utils\video_processor_v2.py:317": process_video_pipeline -
步驟3: 提取高清關鍵幀...
提取高清幀: 100%|██████████| 1/1 [00:00<00:00,  2.83it/s]
2025-02-23 23:00:47 | INFO | "./app\utils\video_processor_v2.py:194": extract_frames_by_numbers - 共提取了 1 個不同時間戳的幀
2025-02-23 23:00:47 | INFO | "./app\utils\video_processor_v2.py:325": process_video_pipeline - 處理完成！高清關鍵幀保存在: .\storage\temp\keyframes\37d1b60812267477b6ac5d5c610b737d
2025-02-23 23:00:48 | INFO | "./app\utils\video_processor_v2.py:370": process_video_pipeline - 臨時文件已清理
2025-02-23 23:00:48 | DEBUG | "./webui\tools\generate_script_docu.py:106": generate_script_docu - Vision LLM 提供商: qwenvl
2025-02-23 23:00:49 | INFO | "./app\utils\qwenvl_analyzer.py:121": analyze_images - 正在加載圖片...
分析進度: 100%|██████████| 1/1 [00:04<00:00,  4.82s/it]
2025-02-23 23:00:56 | DEBUG | "./webui\tools\generate_script_docu.py:162": generate_script_docu - 批次 0 處理完成，共 1 張圖片
2025-02-23 23:00:56 | DEBUG | "./webui\tools\generate_script_docu.py:166": generate_script_docu - 處理時間戳: 00:00:38,500-00:00:38,500
2025-02-23 23:00:56 | DEBUG | "./webui\tools\generate_script_docu.py:211": generate_script_docu - 添加幀內容: 時間范圍=00:00:38,500-00:00:38,500, 分析結果長度=149
2025-02-23 23:00:58 | INFO | "./app\utils\script_generator.py:319": __init__ - 文本 LLM 提供商: ep-20241231163508-gh4jr
2025-02-23 23:00:59 | WARNING | "./app\utils\script_generator.py:101": __init__ - 未找到模型 ep-20241231163508-gh4jr 的專用編碼器，使用默認編碼器
2025-02-23 23:01:23 | DEBUG | "./app\utils\script_generator.py:430": calculate_duration_and_word_count - 時間范圍 00:00:38,500-00:00:38,500 的持續時間為 0.000秒, 估算字數: 10
2025-02-23 23:01:24 | INFO | "./app\utils\script_generator.py:443": process_frames - 時間范圍: 00:00:38,500-00:00:38,500, 建議字數: 10
2025-02-23 23:01:24 | INFO | "./app\utils\script_generator.py:444": process_frames - 倆同款著裝室內整活
2025-02-23 23:01:24 | INFO | "./app\utils\script_generator.py:514": _save_results - 保存腳本成功，總時長: 00:00:00,000
2025-02-23 23:01:24 | INFO | "./webui\tools\generate_script_docu.py:253": generate_script_docu - 腳本生成完成
2025-02-23 23:01:39.776 Examining the path of torch.classes raised: Tried to instantiate class '__path__._path',
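The UnicodeDecodeError in this log is raised in a subprocess reader thread that decodes the child's output with the console's default gbk codec. In your own scripts, one common workaround is to decode explicitly as UTF-8 with errors="replace". The sketch below demonstrates the failure and the workaround on a made-up byte string; it is not a patch to NarratoAI.

```python
import subprocess
import sys

raw = b"\x80abc"  # 0x80 is the byte reported in the first error line of the log

try:
    raw.decode("gbk")
    gbk_failed = False
except UnicodeDecodeError:
    gbk_failed = True  # same error class as in the log

# An explicit encoding with errors="replace" never raises on bad bytes.
safe_text = raw.decode("utf-8", errors="replace")

# The same options can be given to subprocess.run, which then uses them
# when reading the child's stdout and stderr.
child_code = "import sys; sys.stdout.buffer.write(b'ok \\x80\\n')"
result = subprocess.run(
    [sys.executable, "-c", child_code],
    capture_output=True,
    encoding="utf-8",
    errors="replace",
    check=True,
)
print(gbk_failed, repr(safe_text), repr(result.stdout))
```

In the run shown above, the error did not stop processing: the pipeline went on to extract keyframes and generate the script.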

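Example 2 (automatic video narration): editing a short drama

The procedure is the same as in the previous section: upload the video and generate the narration script, then go through the following steps in order: "Check script format" -> "Save script" -> "Crop video" -> "Generate video".

Video cropped successfully:

Video generated successfully:

Processing log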

2025-02-24 10:04:55.761 | INFO     | __main__:render_generate_button:128 - 開始生成視頻
2025-02-24 10:04:55.761 | INFO     | app.services.task:start_subclip:162 - ## 開始任務: 6d2dfc04-0de9-4fb3-aedb-6e200c49cead
2025-02-24 10:04:55.796 | INFO     | app.services.task:start_subclip:173 - ## 1. 加載視頻腳本
2025-02-24 10:04:55.825 | DEBUG    | app.services.task:start_subclip:185 - 解說完整腳本: 
瞧瞧這幾人各懷心思，圍繞贖金展開拉扯啦 瞧這室內幾人著裝各異 拉扯還在繼續 瞧這室內拉扯 各懷心思忙 看這室內眾人 神色各有千秋 看這室內眾人 神色各有千秋
黑皮女后綠裝男 業務網游挖幣 眾人著裝神態超有趣 眾人著裝神態妙后續更逗 眾人著裝神態妙后續更逗，且看這幾人要弄啥幺蛾子 深色西裝男先亮相，倆嚴肅女登場又有啥戲？ 深綠西裝男室內要干啥？
2025-02-24 10:04:55.825 | DEBUG    | app.services.task:start_subclip:186 - 解說 OST 列表: 
[2, 2, 2, 2, 2, 2, 2, 2, 2, 2]
2025-02-24 10:04:55.825 | DEBUG    | app.services.task:start_subclip:187 - 解說時間戳列表: 
['00:00:03,240-00:00:12,160', '00:00:14,000-00:00:22,079', '00:00:23,120-00:00:29,079', '00:00:30,440-00:00:36,960', '00:00:38,200-00:00:47,920', '00:00:48,759-00:00:52,679', '00:00:54,039-00:01:00,240', '00:01:01,840-00:01:11,079', '00:01:12,640-00:01:22,239', '00:01:24,319-00:01:26,719']
2025-02-24 10:04:55.839 | INFO     | app.services.task:start_subclip:201 - ## 2. 根據OST設置生成音頻列表
2025-02-24 10:04:55.856 | DEBUG    | app.services.task:start_subclip:207 - 需要生成TTS的片段數: 10
2025-02-24 10:04:55.877 | INFO     | app.services.voice:azure_tts_v1:1074 - 第 1 次使用 edge_tts 生成音頻
2025-02-24 10:04:57.453 | INFO     | app.services.voice:azure_tts_v1:1109 - completed, output file: .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\audio_00_00_00,000-00_00_08,919.mp3
2025-02-24 10:04:57.453 | INFO     | app.services.voice:tts_multiple:1375 - 已生成音頻文件: .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\audio_00_00_00,000-00_00_08,919.mp3
2025-02-24 10:04:57.453 | INFO     | app.services.voice:azure_tts_v1:1074 - 第 1 次使用 edge_tts 生成音頻
2025-02-24 10:04:59.108 | INFO     | app.services.voice:azure_tts_v1:1109 - completed, output file: .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\audio_00_00_08,919-00_00_16,999.mp3
2025-02-24 10:04:59.108 | INFO     | app.services.voice:tts_multiple:1375 - 已生成音頻文件: .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\audio_00_00_08,919-00_00_16,999.mp3
2025-02-24 10:04:59.108 | INFO     | app.services.voice:azure_tts_v1:1074 - 第 1 次使用 edge_tts 生成音頻
2025-02-24 10:05:10.965 | INFO     | app.services.voice:azure_tts_v1:1109 - completed, output file: .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\audio_00_00_16,999-00_00_22,958.mp3
2025-02-24 10:05:10.965 | INFO     | app.services.voice:tts_multiple:1375 - 已生成音頻文件: .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\audio_00_00_16,999-00_00_22,958.mp3
2025-02-24 10:05:10.965 | INFO     | app.services.voice:azure_tts_v1:1074 - 第 1 次使用 edge_tts 生成音頻
2025-02-24 10:05:12.429 | INFO     | app.services.voice:azure_tts_v1:1109 - completed, output file: .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\audio_00_00_22,958-00_00_29,478.mp3
2025-02-24 10:05:12.445 | INFO     | app.services.voice:tts_multiple:1375 - 已生成音頻文件: .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\audio_00_00_22,958-00_00_29,478.mp3
2025-02-24 10:05:12.445 | INFO     | app.services.voice:azure_tts_v1:1074 - 第 1 次使用 edge_tts 生成音頻
2025-02-24 10:05:14.044 | INFO     | app.services.voice:azure_tts_v1:1109 - completed, output file: .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\audio_00_00_29,478-00_00_39,198.mp3
2025-02-24 10:05:14.060 | INFO     | app.services.voice:tts_multiple:1375 - 已生成音頻文件: .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\audio_00_00_29,478-00_00_39,198.mp3
2025-02-24 10:05:14.060 | INFO     | app.services.voice:azure_tts_v1:1074 - 第 1 次使用 edge_tts 生成音頻
2025-02-24 10:05:15.609 | INFO     | app.services.voice:azure_tts_v1:1109 - completed, output file: .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\audio_00_00_39,198-00_00_43,118.mp3
2025-02-24 10:05:15.610 | INFO     | app.services.voice:tts_multiple:1375 - 已生成音頻文件: .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\audio_00_00_39,198-00_00_43,118.mp3
2025-02-24 10:05:15.610 | INFO     | app.services.voice:azure_tts_v1:1074 - 第 1 次使用 edge_tts 生成音頻
2025-02-24 10:05:17.475 | INFO     | app.services.voice:azure_tts_v1:1109 - completed, output file: .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\audio_00_00_43,118-00_00_49,319.mp3
2025-02-24 10:05:17.486 | INFO     | app.services.voice:tts_multiple:1375 - 已生成音頻文件: .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\audio_00_00_43,118-00_00_49,319.mp3
2025-02-24 10:05:17.489 | INFO     | app.services.voice:azure_tts_v1:1074 - 第 1 次使用 edge_tts 生成音頻
2025-02-24 10:05:19.217 | INFO     | app.services.voice:azure_tts_v1:1109 - completed, output file: .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\audio_00_00_49,319-00_00_58,557.mp3
2025-02-24 10:05:19.217 | INFO     | app.services.voice:tts_multiple:1375 - 已生成音頻文件: .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\audio_00_00_49,319-00_00_58,557.mp3
2025-02-24 10:05:19.217 | INFO     | app.services.voice:azure_tts_v1:1074 - 第 1 次使用 edge_tts 生成音頻
2025-02-24 10:05:20.749 | INFO     | app.services.voice:azure_tts_v1:1109 - completed, output file: .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\audio_00_00_58,557-00_01_08,156.mp3
2025-02-24 10:05:20.749 | INFO     | app.services.voice:tts_multiple:1375 - 已生成音頻文件: .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\audio_00_00_58,557-00_01_08,156.mp3
2025-02-24 10:05:20.753 | INFO     | app.services.voice:azure_tts_v1:1074 - 第 1 次使用 edge_tts 生成音頻
2025-02-24 10:05:22.315 | INFO     | app.services.voice:azure_tts_v1:1109 - completed, output file: .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\audio_00_01_08,156-00_01_10,556.mp3
2025-02-24 10:05:22.315 | INFO     | app.services.voice:tts_multiple:1375 - 已生成音頻文件: .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\audio_00_01_08,156-00_01_10,556.mp3
2025-02-24 10:05:22.318 | INFO     | app.services.task:start_subclip:228 - 合并音頻文件: ['D:\\code\\NarratoAI\\storage\\tasks\\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\\audio_00_00_00,000-00_00_08,919.mp3', 'D:\\code\\NarratoAI\\storage\\tasks\\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\\audio_00_00_08,919-00_00_16,999.mp3', 'D:\\code\\NarratoAI\\storage\\tasks\\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\\audio_00_00_16,999-00_00_22,958.mp3', 'D:\\code\\NarratoAI\\storage\\tasks\\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\\audio_00_00_22,958-00_00_29,478.mp3', 'D:\\code\\NarratoAI\\storage\\tasks\\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\\audio_00_00_29,478-00_00_39,198.mp3', 'D:\\code\\NarratoAI\\storage\\tasks\\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\\audio_00_00_39,198-00_00_43,118.mp3', 'D:\\code\\NarratoAI\\storage\\tasks\\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\\audio_00_00_43,118-00_00_49,319.mp3', 'D:\\code\\NarratoAI\\storage\\tasks\\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\\audio_00_00_49,319-00_00_58,557.mp3', 'D:\\code\\NarratoAI\\storage\\tasks\\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\\audio_00_00_58,557-00_01_08,156.mp3', 'D:\\code\\NarratoAI\\storage\\tasks\\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\\audio_00_01_08,156-00_01_10,556.mp3']
2025-02-24 10:05:27.283 | INFO     | app.services.audio_merger:merge_audio_files:73 - 合并后的音頻文件已保存: .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\final_audio.mp3
2025-02-24 10:05:27.283 | INFO     | app.services.task:start_subclip:237 - 音頻文件合并成功
2025-02-24 10:05:27.299 | INFO     | app.services.task:start_subclip:263 - ## 3. 生成字幕、提供程序是: faster-whisper-large-v2
2025-02-24 10:05:27.299 | INFO     | app.services.subtitle:create:69 - 未檢測到 CUDA，使用 CPU 模式
2025-02-24 10:05:27.302 | INFO     | app.services.subtitle:create:78 - 使用 CPU 加載模型: ./app/models/faster-whisper-large-v2
2025-02-24 10:05:38.484 | INFO     | app.services.subtitle:create:86 - 模型加載完成，使用設備: cpu, 計算類型: int8
2025-02-24 10:05:38.484 | INFO     | app.services.subtitle:create:88 - start, output file: .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\subtitle.srt
2025-02-24 10:06:00.072 | INFO     | app.services.subtitle:create:101 - 檢測到的語言: 'zh', probability: 1.00
2025-02-24 10:06:44.076 | DEBUG    | app.services.subtitle:recognized:114 - [0.00s -> 1.82s] 瞧瞧這幾人各懷心思
2025-02-24 10:06:44.077 | DEBUG    | app.services.subtitle:recognized:114 - [2.20s -> 3.84s] 圍繞贖金展開拉扯了
2025-02-24 10:06:44.078 | DEBUG    | app.services.subtitle:recognized:114 - [9.12s -> 11.00s] 瞧這室內幾人著裝各異
2025-02-24 10:06:44.080 | DEBUG    | app.services.subtitle:recognized:114 - [11.36s -> 12.52s] 拉扯還在繼續
2025-02-24 10:06:44.081 | DEBUG    | app.services.subtitle:recognized:114 - [17.16s -> 18.36s] 瞧這室內拉扯
2025-02-24 10:06:44.082 | DEBUG    | app.services.subtitle:recognized:114 - [18.72s -> 19.70s] 各懷心思忙
2025-02-24 10:06:44.084 | DEBUG    | app.services.subtitle:recognized:114 - [23.14s -> 24.28s] 看這室內眾人
2025-02-24 10:06:44.085 | DEBUG    | app.services.subtitle:recognized:114 - [24.60s -> 25.86s] 神色各有千秋
2025-02-24 10:06:44.086 | DEBUG    | app.services.subtitle:recognized:114 - [29.67s -> 30.77s] 看這室內眾人
2025-02-24 10:06:44.088 | DEBUG    | app.services.subtitle:recognized:114 - [31.11s -> 32.35s] 神色各有千秋
2025-02-24 10:06:44.089 | DEBUG    | app.services.subtitle:recognized:114 - [33.03s -> 34.45s] 黑皮女后綠妝男
2025-02-24 10:06:44.107 | DEBUG    | app.services.subtitle:recognized:114 - [34.81s -> 35.85s] 業務網游挖幣
2025-02-24 10:06:44.110 | DEBUG    | app.services.subtitle:recognized:114 - [39.38s -> 41.16s] 眾人著裝神態超有趣
2025-02-24 10:06:44.112 | DEBUG    | app.services.subtitle:recognized:114 - [43.31s -> 44.65s] 眾人著裝神態妙
2025-02-24 10:06:44.113 | DEBUG    | app.services.subtitle:recognized:114 - [44.91s -> 45.65s] 后續更逗
2025-02-24 10:07:28.051 | DEBUG    | app.services.subtitle:recognized:114 - [51.67s -> 54.31s] 且看這幾人要弄啥幺蛾子
2025-02-24 10:07:28.052 | DEBUG    | app.services.subtitle:recognized:114 - [58.77s -> 60.45s] 深色西裝男先亮相
2025-02-24 10:07:28.052 | DEBUG    | app.services.subtitle:recognized:114 - [60.87s -> 61.99s] 雅顏素女登場
2025-02-24 10:07:28.052 | DEBUG    | app.services.subtitle:recognized:114 - [62.19s -> 62.89s] 又有啥戲
2025-02-24 10:07:28.052 | DEBUG    | app.services.subtitle:recognized:114 - [68.36s -> 70.42s] 深綠西裝男室內要干啥
2025-02-24 10:07:28.052 | INFO     | app.services.subtitle:create:164 - complete, elapsed: 87.97 s
2025-02-24 10:07:28.052 | INFO     | app.services.subtitle:create:181 - subtitle file created: .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\subtitle.srt
2025-02-24 10:07:28.067 | INFO     | app.services.task:start_subclip:277 - ## 4. 裁剪視頻
2025-02-24 10:07:28.083 | INFO     | app.services.task:start_subclip:295 - ## 5. 合并視頻: => .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\combined.mp4
2025-02-24 10:07:28.083 | INFO     | app.services.video:combine_clip_videos:130 - 音頻的最大持續時間: 70.55699999999999 s
2025-02-24 10:07:28.491 | INFO     | app.services.video:combine_clip_videos:155 - 視頻 .\storage\temp\clip_video\860467a7beea883037e8925244df21a9\vid-00-00-03_240-00-00-12_160.mp4 已調整尺寸為 1080 x 1920
2025-02-24 10:07:28.799 | INFO     | app.services.video:combine_clip_videos:155 - 視頻 .\storage\temp\clip_video\860467a7beea883037e8925244df21a9\vid-00-00-14_000-00-00-22_079.mp4 已調整尺寸為 1080 x 1920
2025-02-24 10:07:29.087 | INFO     | app.services.video:combine_clip_videos:155 - 視頻 .\storage\temp\clip_video\860467a7beea883037e8925244df21a9\vid-00-00-23_120-00-00-29_079.mp4 已調整尺寸為 1080 x 1920
2025-02-24 10:07:29.588 | INFO     | app.services.video:combine_clip_videos:155 - 視頻 .\storage\temp\clip_video\860467a7beea883037e8925244df21a9\vid-00-00-30_440-00-00-36_960.mp4 已調整尺寸為 1080 x 1920
2025-02-24 10:07:30.340 | INFO     | app.services.video:combine_clip_videos:155 - 視頻 .\storage\temp\clip_video\860467a7beea883037e8925244df21a9\vid-00-00-38_200-00-00-47_920.mp4 已調整尺寸為 1080 x 1920
2025-02-24 10:07:31.063 | INFO     | app.services.video:combine_clip_videos:155 - 視頻 .\storage\temp\clip_video\860467a7beea883037e8925244df21a9\vid-00-00-48_759-00-00-52_679.mp4 已調整尺寸為 1080 x 1920
2025-02-24 10:07:31.767 | INFO     | app.services.video:combine_clip_videos:155 - 視頻 .\storage\temp\clip_video\860467a7beea883037e8925244df21a9\vid-00-00-54_039-00-01-00_240.mp4 已調整尺寸為 1080 x 1920
2025-02-24 10:07:32.547 | INFO     | app.services.video:combine_clip_videos:155 - 視頻 .\storage\temp\clip_video\860467a7beea883037e8925244df21a9\vid-00-01-01_840-00-01-11_079.mp4 已調整尺寸為 1080 x 1920
2025-02-24 10:07:32.819 | INFO     | app.services.video:combine_clip_videos:155 - 視頻 .\storage\temp\clip_video\860467a7beea883037e8925244df21a9\vid-00-01-12_640-00-01-22_239.mp4 已調整尺寸為 1080 x 1920
2025-02-24 10:07:33.227 | INFO     | app.services.video:combine_clip_videos:155 - 視頻 .\storage\temp\clip_video\860467a7beea883037e8925244df21a9\vid-00-01-24_319-00-01-26_719.mp4 已調整尺寸為 1080 x 1920
2025-02-24 10:07:33.243 | INFO     | app.services.video:combine_clip_videos:170 - 開始合并視頻... (過程中出現 UserWarning: 不必理會)
2025-02-24 10:08:26.147 | SUCCESS  | app.services.video:combine_clip_videos:184 - 視頻合并完成
2025-02-24 10:08:26.147 | INFO     | app.services.task:start_subclip:311 - ## 6. 最后合成: 1 => .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\final-1.mp4
2025-02-24 10:08:26.157 | WARNING  | app.utils.utils:get_bgm_file:153 - 在目錄 .\resource\songs 中沒有找到 MP3 或 FLAC 文件
2025-02-24 10:08:26.657 | INFO     | app.services.video:generate_video_v3:331 - 讀取到 20 條字幕
2025-02-24 10:08:31.316 | INFO     | app.services.video:generate_video_v3:387 - 成功創建 20 條字幕剪輯
2025-02-24 10:08:31.332 | DEBUG    | app.services.video:generate_video_v3:397 - 音量配置: {'original': 0.7, 'bgm': 0.3, 'narration': 1.0}
2025-02-24 10:08:31.483 | INFO     | app.services.video:generate_video_v3:429 - 開始導出視頻...
2025-02-24 10:10:58.073 | INFO     | app.services.video:generate_video_v3:436 - 視頻已導出到: .\storage\tasks\6d2dfc04-0de9-4fb3-aedb-6e200c49cead\final-1.mp4
2025-02-24 10:10:58.079 | SUCCESS  | app.services.task:start_subclip:358 - 任務 6d2dfc04-0de9-4fb3-aedb-6e200c49cead 已完成, 生成 1 個視頻.
2025-02-24 10:10:58.321 | INFO     | __main__:render_generate_button:165 - 視頻生成完成
2025-02-24 10:10:58.508 | DEBUG    | webui.utils.performance:monitor_memory:12 - Memory usage: 2703.35 MB
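The timestamp list in this log can be used to check the total length of the output. The sketch below lays the ten clips end to end in integer milliseconds; to_ms and retime are hypothetical helper names, and this is an illustration of the arithmetic rather than NarratoAI's own code.

```python
from typing import List, Tuple

# Source ranges copied from the timestamp list in the log.
TIMESTAMPS = [
    "00:00:03,240-00:00:12,160", "00:00:14,000-00:00:22,079",
    "00:00:23,120-00:00:29,079", "00:00:30,440-00:00:36,960",
    "00:00:38,200-00:00:47,920", "00:00:48,759-00:00:52,679",
    "00:00:54,039-00:01:00,240", "00:01:01,840-00:01:11,079",
    "00:01:12,640-00:01:22,239", "00:01:24,319-00:01:26,719",
]


def to_ms(stamp: str) -> int:
    """Convert 'HH:MM:SS,mmm' to integer milliseconds."""
    clock, millis = stamp.split(",")
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + int(millis)


def retime(ranges: List[str]) -> List[Tuple[int, int]]:
    """Lay the clips end to end; return (start_ms, end_ms) on the new timeline."""
    timeline: List[Tuple[int, int]] = []
    cursor = 0
    for item in ranges:
        start, end = item.split("-")
        duration = to_ms(end) - to_ms(start)
        timeline.append((cursor, cursor + duration))
        cursor += duration
    return timeline


timeline = retime(TIMESTAMPS)
print(timeline[-1][1] / 1000)  # total length in seconds
```

The total comes to 70.557 seconds, in line with the maximum audio duration of about 70.557 s reported in the log. The audio file names in the log differ from these values by one millisecond in a few places, presumably because the tool works in floating-point seconds.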