【自荐】PDFMathTranslate - 完整保留排版的 PDF 全文翻译器

我想用硅基流动的api然后使用如下命令直接报错:set SILICON_API_KEY=sk-tzzeknpesbtiuxcwggyzkkgijqfmveje(key我删了一部分)

set SILICON_MODEL=Qwen/Qwen2.5-7B-Instruct
pdf2zh example.pdf -s silicon

以下是报错内容

Traceback (most recent call last):
  File "D:\deeplearning\anaconda3\lib\site-packages\urllib3\connectionpool.py", line 700, in urlopen
    self._prepare_proxy(conn)
  File "D:\deeplearning\anaconda3\lib\site-packages\urllib3\connectionpool.py", line 996, in _prepare_proxy
    conn.connect()
  File "D:\deeplearning\anaconda3\lib\site-packages\urllib3\connection.py", line 414, in connect
    self.sock = ssl_wrap_socket(
  File "D:\deeplearning\anaconda3\lib\site-packages\urllib3\util\ssl_.py", line 449, in ssl_wrap_socket
    ssl_sock = _ssl_wrap_socket_impl(
  File "D:\deeplearning\anaconda3\lib\site-packages\urllib3\util\ssl_.py", line 493, in _ssl_wrap_socket_impl
    return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
  File "D:\deeplearning\anaconda3\lib\ssl.py", line 513, in wrap_socket
    return self.sslsocket_class._create(
  File "D:\deeplearning\anaconda3\lib\ssl.py", line 1071, in _create
    self.do_handshake()
  File "D:\deeplearning\anaconda3\lib\ssl.py", line 1342, in do_handshake
    self._sslobj.do_handshake()
TimeoutError: _ssl.c:980: The handshake operation timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\deeplearning\anaconda3\lib\site-packages\requests\adapters.py", line 489, in send
    resp = conn.urlopen(
  File "D:\deeplearning\anaconda3\lib\site-packages\urllib3\connectionpool.py", line 787, in urlopen
    retries = retries.increment(
  File "D:\deeplearning\anaconda3\lib\site-packages\urllib3\util\retry.py", line 592, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /wybxc/DocLayout-YOLO-DocStructBench-onnx/resolve/main/doclayout_yolo_docstructbench_imgsz1024.onnx (Caused by ProxyError('Cannot connect to proxy.', TimeoutError('_ssl.c:980: The handshake operation timed out')))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\deeplearning\anaconda3\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "D:\deeplearning\anaconda3\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "D:\deeplearning\anaconda3\Scripts\pdf2zh.exe\__main__.py", line 4, in <module>
  File "D:\deeplearning\anaconda3\lib\site-packages\pdf2zh\__init__.py", line 2, in <module>
    from pdf2zh.high_level import translate, translate_stream
  File "D:\deeplearning\anaconda3\lib\site-packages\pdf2zh\high_level.py", line 24, in <module>
    model = DocLayoutModel.load_available()
  File "D:\deeplearning\anaconda3\lib\site-packages\pdf2zh\doclayout.py", line 21, in load_available
    return DocLayoutModel.load_onnx()
  File "D:\deeplearning\anaconda3\lib\site-packages\pdf2zh\doclayout.py", line 13, in load_onnx
    model = OnnxModel.from_pretrained(
  File "D:\deeplearning\anaconda3\lib\site-packages\pdf2zh\doclayout.py", line 73, in from_pretrained
    pth = hf_hub_download(repo_id=repo_id, filename=filename, etag_timeout=1)
  File "D:\deeplearning\anaconda3\lib\site-packages\huggingface_hub\utils\_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "D:\deeplearning\anaconda3\lib\site-packages\huggingface_hub\file_download.py", line 860, in hf_hub_download
    return _hf_hub_download_to_cache_dir(
  File "D:\deeplearning\anaconda3\lib\site-packages\huggingface_hub\file_download.py", line 923, in _hf_hub_download_to_cache_dir
    (url_to_download, etag, commit_hash, expected_size, head_call_error) = _get_metadata_or_catch_error(
  File "D:\deeplearning\anaconda3\lib\site-packages\huggingface_hub\file_download.py", line 1374, in _get_metadata_or_catch_error
    metadata = get_hf_file_metadata(
  File "D:\deeplearning\anaconda3\lib\site-packages\huggingface_hub\utils\_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "D:\deeplearning\anaconda3\lib\site-packages\huggingface_hub\file_download.py", line 1294, in get_hf_file_metadata
    r = _request_wrapper(
  File "D:\deeplearning\anaconda3\lib\site-packages\huggingface_hub\file_download.py", line 278, in _request_wrapper
    response = _request_wrapper(
  File "D:\deeplearning\anaconda3\lib\site-packages\huggingface_hub\file_download.py", line 301, in _request_wrapper
    response = get_session().request(method=method, url=url, **params)
  File "D:\deeplearning\anaconda3\lib\site-packages\requests\sessions.py", line 587, in request
    resp = self.send(prep, **send_kwargs)
  File "D:\deeplearning\anaconda3\lib\site-packages\requests\sessions.py", line 701, in send
    r = adapter.send(request, **kwargs)
  File "D:\deeplearning\anaconda3\lib\site-packages\huggingface_hub\utils\_http.py", line 93, in send
    return super().send(request, *args, **kwargs)
  File "D:\deeplearning\anaconda3\lib\site-packages\requests\adapters.py", line 559, in send
    raise ProxyError(e, request=request)
requests.exceptions.ProxyError: (MaxRetryError("HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /wybxc/DocLayout-YOLO-DocStructBench-onnx/resolve/main/doclayout_yolo_docstructbench_imgsz1024.onnx (Caused by ProxyError('Cannot connect to proxy.', TimeoutError('_ssl.c:980: The handshake operation timed out')))"), '(Request ID: 140bbb81-d3a4-4400-a45b-4f07eed5c4b0)')
set OPENAI_BASE_URL=https://api.deepseek.com
set OPENAI_API_KEY=sk-yyybbbxxxx
set OPENAI_MODEL=deepseek-chat

用了非常棒。谢谢作者:)

按照这几个指令执行之后,执行pdf2zh example.pdf -s deepseek后报错如图ValueError: Unsupported translation service

我告诉你这种方式是通过 OPENAI 格式调用

pdf2zh example.pdf -s openai

我用的早,之前不支持单独设置,现在可以了,具体参考PDFMathTranslate/docs/ADVANCED.md at main · Byaidu/PDFMathTranslate · GitHub

set DEEPSEEK_API_KEY=
set DEEPSEEK_MODEL=

本地部署的setup.bat 点击以后没有链接怎么办

才发现这么好的一个项目!已经在教研室局域网内部署上了。原来作者也来小众软件了呀,而且很有耐心有问必答,请问哪里有捐赠渠道呀,一杯咖啡聊表心意。

作者好,想问一下用pip安装完成后运行pdf2zh -i 提示 pdf2zh : 无法将“pdf2zh”项识别为 cmdlet、函数、脚本文件或可运行程序的名称。请检查名称的拼写,如果包括路径,请确保路径正确,然后再试一次。是什么原因啊