Photography / Cinematic / Realistic Scene
Leaked AI Benchmark Report Photo
Generate a photorealistic photo of a computer screen displaying an academic technical report with bar charts and a detailed performance table.
- ID: 13482
- Author: Anneshu Nag
- Tags: Photography / Cinematic / Realistic Scene / Logo / Branding / Visual Identity / 3D / Game / Pixel / Isometric
Original Prompt
{
  "type": "photograph of a computer monitor displaying an academic technical report",
  "style": "slightly angled screen photo, visible moire pattern, LCD pixel grid, slight glare, LaTeX document formatting, serif fonts",
  "document_header": {
    "left": "4 Benchmark Evaluation",
    "right": "{argument name=\"report title\" default=\"DeepSeek-V4 Technical Report\"}"
  },
  "introductory_text": "Paragraph summarizing comprehensive evaluation of {argument name=\"main model name\" default=\"DeepSeek-V4\"} against {argument name=\"competitor model 1\" default=\"GPT-5.3\"}, {argument name=\"competitor model 2\" default=\"Claude Opus 4.6\"}, and {argument name=\"competitor model 3\" default=\"Gemini 3.1 Pro Preview\"}.",
  "visualizations": {
    "legend": "5 items with color codes: dark blue, grey, light grey, blue striped, light blue",
    "bar_charts": {
      "count": 6,
      "labels": [
        "MMLU-Pro (EM)",
        "GPQA-Diamond (Pass@1)",
        "AIME 2025 (Pass@1)",
        "LiveCodeBench (Pass@1-COT)",
        "SWE-bench Verified (Resolved)",
        "Tau-bench (Average)"
      ]
    },
    "caption": "Figure 1 | Performance comparison on core benchmarks. DeepSeek-V4 achieves state-of-the-art results across the majority of benchmarks."
  },
  "data_table": {
    "columns": [
      "Benchmark",
      "{argument name=\"main model name\" default=\"DeepSeek-V4\"}",
      "{argument name=\"competitor model 1\" default=\"GPT-5.3\"}",
      "{argument name=\"competitor model 2\" default=\"Claude Opus 4.6\"}",
      "{argument name=\"competitor model 3\" default=\"Gemini 3.1 Pro Preview\"}",
      "GPT-4.1"
    ],
    "categories": {
      "count": 4,
      "rows": [
        {"label": "General", "icon": "globe/network", "sub_items": 3},
        {"label": "Reasoning & Math", "icon": "calculator/clipboard", "sub_items": 3},
        {"label": "Code", "icon": "code brackets", "sub_items": 3},
        {"label": "Agent", "icon": "robot face", "sub_items": 3}
      ]
    }
  }
}