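A Semi-Automatic Lyric Timing Assistant Based on the Whisper Neural Network (Work in Progress)

Miscellaneous

0. Preface

Listening to music is one of the main ways I enjoy life.

However, when I come across a song that has no lyrics available, the missing lyrics bother me (infuriate me, just kidding).

So I decided to make my own scrolling LRC lyrics and time them by hand.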

.

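But then a new problem came up:

To get accurate timing and a better listening experience, timing and proofreading the lyrics by hand takes a great deal of time and effort ∠(°ゝ°)

In today's fast-paced life, that is clearly hard to keep up with.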

.

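With the rapid development of technology, artificial intelligence, and neural networks, we can let AI time the lyrics for us!

1. Aba aba?

Aba aba, I can't write any more right now; I'll make up some more another day.

In short: use the offline OpenAI Whisper model inside Buzz to transcribe the song into SRT subtitles first, then run a script I wrote to convert the SRT into LRC.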
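
The core of the SRT-to-LRC step is rewriting each cue's start time as an LRC time tag. Here is a minimal sketch of that one step, assuming SRT timestamps of the form HH:MM:SS,mmm and the three-digit millisecond LRC tags this post uses. The function name `srt_time_to_lrc_tag` and the sample cue are made up for illustration.

```python
import re

def srt_time_to_lrc_tag(srt_time: str) -> str:
    """Convert an SRT timestamp (HH:MM:SS,mmm) to an LRC tag ([MM:SS.mmm])."""
    match = re.fullmatch(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})", srt_time.strip())
    if match is None:
        raise ValueError(f"not an SRT timestamp: {srt_time!r}")
    hours, minutes, seconds, millis = match.groups()
    # LRC tags have no hours field, so hours are folded into minutes.
    total_minutes = int(hours) * 60 + int(minutes)
    return f"[{total_minutes:02d}:{seconds}.{millis}]"

# One SRT cue: index line, timecode line, text line (sample text is made up).
srt_cue = "1\n00:00:12,340 --> 00:00:15,000\nla la la"
index, timecode, text = srt_cue.split("\n")
start, end = timecode.split(" --> ")
lrc_line = srt_time_to_lrc_tag(start) + text
print(lrc_line)  # [00:12.340]la la la
```

Only the start time is used. An LRC line carries no end time; it stays on screen until the next tag.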

.

Buzz installer:
https://pan.baidu.com/s/1ZPjxF7_h9SGVd-WKHsOVxw?pwd=lawa

.

Requirements: a Python environment and the OpenCC library
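
Before running the script, it may help to confirm that the modules it imports are available. The sketch below uses only the standard library; the helper name `missing_modules` is made up for illustration. If `opencc` is reported missing, it is usually installed with pip, though the exact package name to use (for example `opencc` or `opencc-python-reimplemented`) is an assumption here and not something this post specifies.

```python
import importlib.util
import sys

def missing_modules(names):
    """Return the module names from `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# Third-party and optional modules the conversion script imports.
required = ["opencc", "tkinter"]
missing = missing_modules(required)
print("Python", sys.version.split()[0])
print("Missing modules:", ", ".join(missing) if missing else "none")
```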

import re
import os
from opencc import OpenCC
import tkinter as tk
from tkinter import filedialog
import ctypes

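def convert_srt_to_custom_format(srt_file_path):
    # Initialize the converter: Traditional --> Simplified Chinese
    cc = OpenCC('t2s')

    # utf-8-sig also strips a leading BOM if the SRT file has one
    with open(srt_file_path, 'r', encoding='utf-8-sig') as file:
        content = file.read()

    output_lines = [
        "[ti:]", 
        "[ar:]", 
        "[al:]", 
        "[by:RQvan]", 
        "[00:00.000]"
    ]
    # Replace RQvan above with your own name awa

    # SRT cues are separated by blank lines: index, timecode, then one or more text lines
    for block in re.split(r'\n\s*\n', content.strip()):
        block_lines = block.splitlines()
        if len(block_lines) < 3 or ' --> ' not in block_lines[1]:
            continue
        timecode = block_lines[1].strip()
        text = ' '.join(line.strip() for line in block_lines[2:])

        # Traditional --> Simplified
        text = cc.convert(text)
        
        # Time conversion: SRT "HH:MM:SS,mmm" --> LRC "[MM:SS.mmm]" (hours folded into minutes)
        start_time, _ = timecode.split(' --> ')
        match = re.match(r'(\d+):(\d{2}):(\d{2})[,.](\d{3})', start_time.strip())
        if not match:
            continue
        hours, minutes, seconds, millis = match.groups()
        total_minutes = int(hours) * 60 + int(minutes)
        output_lines.append(f"[{total_minutes:02d}:{seconds}.{millis}]{text}")
    
    # Save as .lrc
    # 'ansi' is a Windows-only codec alias (the system code page);
    # use 'utf-8' on other systems or if your player reads UTF-8
    output_file_path = os.path.splitext(srt_file_path)[0] + '.lrc'
    with open(output_file_path, 'w', encoding='ansi') as file:
        for line in output_lines:
            file.write(line + '\n')
            
    '''
    # Save as .txt
    output_file_path = os.path.splitext(srt_file_path)[0] + '.txt'
    with open(output_file_path, 'w', encoding='utf-8') as file:
        for line in output_lines:
            file.write(line + '\n')
    '''
    print(f"Conversion complete, saved to: {output_file_path}")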

def process_path(input_path):
    if os.path.isfile(input_path) and input_path.lower().endswith('.srt'):
        # Input is a single SRT file
        convert_srt_to_custom_format(input_path)
    elif os.path.isdir(input_path):
        # Input is a folder: convert every SRT file under it, subfolders included
        for root, dirs, files in os.walk(input_path):
            for file in files:
                if file.lower().endswith('.srt'):
                    srt_file_path = os.path.join(root, file)
                    convert_srt_to_custom_format(srt_file_path)
    else:
        print("Please provide a valid SRT file or folder path.")

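def select_path():
    root = tk.Tk()
    root.withdraw()
    # Ask for a folder first; if that dialog is cancelled, ask for a single SRT file
    path = filedialog.askdirectory(title="Select a folder of SRT files (cancel to pick a single file)")
    if not path:
        path = filedialog.askopenfilename(filetypes=[("SRT files", "*.srt")], title="Select an SRT file")
    root.destroy()
    return path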

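def main():
    input_path = select_path()
    if input_path:
        process_path(input_path)
        # Show a message box (ctypes.windll exists only on Windows)
        if os.name == 'nt':
            ctypes.windll.user32.MessageBoxW(0, "Conversion complete!\nClick OK to close.", "Notice", 0)
    else:
        print("No valid path selected.")

    # Keep the console window open until the user presses Enter
    input("Press Enter to close the window...")

if __name__ == "__main__":
    main()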

.

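2. Remaining problems

The biggest problem right now is Buzz's accuracy... I hope to find a more accurate source of free labor soon.

The secondary problem is that manual proofreading is still needed. By comparison, though, this time we only have to replace the lyric text.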
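
The "replace the lyric text" step can itself be partly scripted. Here is a minimal sketch, assuming the recognized lines and the correct lyric lines match one-to-one and in order; that depends on how Whisper happened to segment the song and will not always hold. The names `replace_lyrics` and `TIME_TAG` and the sample strings are made up for illustration.

```python
import re

# Matches a leading LRC time tag such as [00:12.340] or [01:05.20].
TIME_TAG = re.compile(r"^(\[\d{2,}:\d{2}(?:\.\d{1,3})?\])(.*)$")

def replace_lyrics(lrc_text, correct_lines):
    """Keep each timestamp from lrc_text and swap in text from correct_lines, in order."""
    remaining = iter(correct_lines)
    result = []
    for line in lrc_text.splitlines():
        match = TIME_TAG.match(line)
        # Metadata tags ([ti:], [by:...]) and empty timed lines pass through unchanged.
        if match is None or not match.group(2).strip():
            result.append(line)
            continue
        # If the correct lyrics run out, keep the recognized text.
        result.append(match.group(1) + next(remaining, match.group(2)))
    return "\n".join(result)

recognized = "[ti:]\n[00:00.000]\n[00:12.340]helo wrld\n[00:15.000]gudbye"
print(replace_lyrics(recognized, ["hello world", "goodbye"]))
```

When the line counts differ, the output still needs a manual pass, since every line after the first mismatch will be shifted.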

.

3. There is no 3

http://rqvan.top/archives/Buzz%26Whisper

Author

RQvan

Published

2024-08-16

Updated

2024-08-16

License

CC BY 4.0
