AINewsCollector/HUGGINGFACE_API_ISSUE.md
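bojunc 954fe80fa7 fix: fix arXiv and HuggingFace collection
- arXiv: switch to a curl subprocess to support proxies; sort by lastUpdatedDate and filter in code to papers from the last 48 hours
- HuggingFace: correct the API endpoint to /api/daily_papers (underscore)
- Improve the HTTP request wrapper so it works reliably behind a proxy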
2026-02-27 23:33:49 +08:00

# HuggingFace API Issue Analysis
## Diagnosis
**Error message:**
```
HTTP/2 401
x-error-message: Invalid username or password.
{"error":"Invalid username or password."}
```
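**Cause:**
The request to HuggingFace's `/api/daily-papers` endpoint was rejected as unauthenticated, which suggests that the endpoint requires authentication. Note that the fix commit for this file later corrected the endpoint path to `/api/daily_papers` (underscore).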
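To make this failure easier to spot in logs, the collector could turn the response into a short diagnostic. The following is a minimal sketch: `describeHfFailure` is a hypothetical helper (not part of collect.js), and it assumes the error body is JSON of the form shown above.

```javascript
// Hypothetical helper: turn an HTTP status and raw body into a log-friendly hint.
function describeHfFailure(status, rawBody) {
  let detail = '';
  try {
    // Assumption: the error body is JSON of the form {"error": "..."}.
    const parsed = JSON.parse(rawBody);
    detail = parsed && typeof parsed.error === 'string' ? parsed.error : '';
  } catch (err) {
    // Non-JSON body (for example an HTML error page): keep the raw text, truncated.
    detail = String(rawBody).slice(0, 200);
  }

  if (status === 401) {
    return `401 Unauthorized (${detail}): check the endpoint path and the HF_TOKEN value`;
  }
  if (status === 429) {
    return `429 Too Many Requests (${detail}): slow down and retry later`;
  }
  return `HTTP ${status} (${detail})`;
}

console.log(describeHfFailure(401, '{"error":"Invalid username or password."}'));
```

The 401 hint mentions both the path and the token because either one could plausibly produce this response.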
## Solutions
### Option 1: Get a HuggingFace API token (recommended)
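1. **Register or log in to a HuggingFace account**
   - Visit https://huggingface.co/join
   - Or log in at https://huggingface.co/login
2. **Get an access token**
   - Visit https://huggingface.co/settings/tokens
   - Click "Create new token"
   - Select the "Read" permission
   - Copy the generated token
3. **Configure the collection script**
   - Add the authentication header in collect.js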
```javascript
const headers = {
  'Authorization': `Bearer ${process.env.HF_TOKEN || 'YOUR_TOKEN_HERE'}`
};
```
4. **Set the environment variable**
```bash
export HF_TOKEN="hf_xxxxxxxxxxxx"
```
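The steps above can be sketched end to end as follows. This is an illustration under stated assumptions, not the actual collect.js code: `buildHeaders` and `fetchDailyPapers` are hypothetical names, the URL uses the `/api/daily_papers` path named in the fix commit, and the request relies on the global `fetch` in Node.js 18+, which may not pick up proxy settings the way the collector's own HTTP wrapper does. Unlike a placeholder fallback, it omits the `Authorization` header entirely when `HF_TOKEN` is unset.

```javascript
// Assumption: the endpoint path named in the fix commit; adjust if your setup differs.
const DAILY_PAPERS_URL = 'https://huggingface.co/api/daily_papers';

// Build request headers, adding Authorization only when a token is present.
function buildHeaders(env = process.env) {
  const headers = { Accept: 'application/json' };
  const token = (env.HF_TOKEN || '').trim();
  if (token) {
    headers.Authorization = `Bearer ${token}`;
  }
  return headers;
}

// Hypothetical fetch wrapper; requires Node.js 18+ for the global fetch.
async function fetchDailyPapers(env = process.env) {
  const response = await fetch(DAILY_PAPERS_URL, { headers: buildHeaders(env) });
  if (!response.ok) {
    throw new Error(`HuggingFace request failed: HTTP ${response.status}`);
  }
  return response.json();
}
```

Keeping header construction in its own function makes it possible to check the token handling without touching the network.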
### Option 2: Use the HuggingFace Hub library (Python)
```python
from huggingface_hub import HfApi
api = HfApi()
papers = api.list_papers(limit=30)
```
### Option 3: Temporarily disable the HuggingFace source
In `config.json`:
```json
{
  "sources": {
    "huggingface": {
      "enabled": false
    }
  }
}
```
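How the collector reads this flag is not shown in this document. As a rough sketch, a source filter could look like the following; `enabledSources` is a hypothetical helper, and treating a source without an explicit `enabled` flag as enabled is an assumption.

```javascript
// Hypothetical helper: return the names of sources that are not disabled.
// Assumption: a source without an explicit "enabled" flag counts as enabled.
function enabledSources(config) {
  const sources = (config && config.sources) || {};
  return Object.keys(sources).filter((name) => {
    const entry = sources[name];
    return !entry || entry.enabled !== false;
  });
}

// Example configuration; the "github" and "arxiv" keys are illustrative.
const exampleConfig = {
  sources: {
    github: { enabled: true },
    huggingface: { enabled: false },
    arxiv: {},
  },
};

console.log(enabledSources(exampleConfig)); // [ 'github', 'arxiv' ]
```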
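## API documentation
- Official API docs: https://huggingface.co/docs/hub/api
- OpenAPI specification: https://huggingface.co/.well-known/openapi.json
- Rate limits: all API calls are subject to rate limiting
## Current status
- ✅ GitHub Trending: working
- ❌ HuggingFace Papers: requires authentication
- ⚠️ arXiv: needs checking
## Recommendations
1. **Short term**: temporarily disable the HuggingFace source and use only GitHub
2. **Long term**: register a HuggingFace account, obtain a token, and enable authenticated access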
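
Because every call is rate-limited, the collector may want to back off and retry rather than fail outright. The sketch below assumes the API signals rate limiting with HTTP 429 and possibly a `Retry-After` header given in seconds; this document does not confirm that behavior. `retryDelayMs` and `withRateLimitRetry` are hypothetical helpers.

```javascript
// Compute how long to wait before retry number `attempt` (0-based).
// Prefers a numeric Retry-After value (seconds); otherwise uses exponential backoff.
function retryDelayMs(attempt, retryAfterHeader, baseMs = 1000, maxMs = 60000) {
  const seconds = Number(retryAfterHeader);
  const hasHeader = retryAfterHeader != null && retryAfterHeader !== '';
  if (hasHeader && Number.isFinite(seconds) && seconds >= 0) {
    return Math.min(seconds * 1000, maxMs);
  }
  return Math.min(baseMs * 2 ** attempt, maxMs);
}

// Hypothetical wrapper: `doRequest` is any function returning a fetch-style Response.
async function withRateLimitRetry(doRequest, maxRetries = 3) {
  for (let attempt = 0; ; attempt += 1) {
    const response = await doRequest();
    // Return anything that is not a rate-limit response, or give up after maxRetries.
    if (response.status !== 429 || attempt >= maxRetries) {
      return response;
    }
    const delay = retryDelayMs(attempt, response.headers.get('retry-after'));
    await new Promise((resolve) => setTimeout(resolve, delay));
  }
}
```

A `Retry-After` value given as an HTTP date is not parsed here; in that case the helper falls back to exponential backoff.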
---
*Diagnosed: 2026-02-25*