Commit d5a9473

Tianwei Zhao authored and committed
update
1 parent c35dc8f commit d5a9473

3 files changed: 121 additions and 148 deletions

README.md

Lines changed: 82 additions & 109 deletions
@@ -1,145 +1,118 @@
-# RewardAnything GitHub Pages
+# Core Knowledge Deficits in Multi-Modal Language Models
 
-This directory contains the GitHub Pages website for the Core Knowledge Deficits in Multi-Modal Language Models project.
+**Official website for the ICML 2025 paper submission**
 
-## 🏗️ Structure
+🌐 **Website**: [https://williamium3000.github.io/core-knowledge](https://williamium3000.github.io/core-knowledge)
+📄 **Paper**: [https://arxiv.org/abs/2410.10855](https://arxiv.org/abs/2410.10855)
+🤗 **Dataset**: [https://huggingface.co/grow-ai-like-a-child](https://huggingface.co/grow-ai-like-a-child)
 
-```
-pages/
-├── _config.yml # Jekyll configuration
-├── _layouts/
-│ └── default.html # Main layout template
-├── index.html # Homepage content
-├── assets/
-│ ├── images/ # Logo and image placeholders
-│ └── favicon.svg # Site favicon
-├── Gemfile # Ruby dependencies
-├── setup.sh # Local setup script
-└── README.md # This file
-```
+## 📖 About
 
-## 🚀 Automatic Deployment
+This repository contains the official website for our paper "Core Knowledge Deficits in Multi-Modal Language Models". The website presents our comprehensive evaluation of 230 multi-modal language models using the **CoreCognition** benchmark, which assesses 12 foundational cognitive concepts grounded in developmental cognitive science.
 
-1. **Changes are pushed** to the `main` branch in the `pages/` directory
-2. **Manual trigger** via GitHub Actions tab
+## 🔍 Key Findings
 
-The deployment is handled by the GitHub Actions workflow in `.github/workflows/deploy-pages.yml`.
+Our research reveals four critical shortcomings in state-of-the-art Multi-modal Large Language Models (MLLMs):
 
-## 🏠 Local Development
+1. **Core Knowledge Deficits**: MLLMs excel at higher-level abilities but struggle with lower-level cognitive abilities
+2. **Misaligned Dependency**: Core abilities show weak cross-stage correlations, lacking developmental scaffolding
+3. **Predictability**: Performance on core knowledge predicts higher-level abilities
+4. **Limited Scaling**: MLLMs show minimal scalability improvements on low-level abilities compared to high-level ones
 
-### Quick Setup
+## 🧠 CoreCognition Benchmark
 
-```bash
-# Navigate to pages directory
-cd pages
+The **CoreCognition** benchmark evaluates twelve foundational cognitive concepts:
+
+1. **Permanence** - Objects persist when not perceived
+2. **Continuity** - Objects remain unified across space and time
+3. **Boundary** - Transitions between objects
+4. **Spatiality** - Understanding Euclidean properties
+5. **Perceptual Constancy** - Appearance changes ≠ property changes
+6. **Intuitive Physics** - Laws of physical interaction
+7. **Perspective** - Seeing what others see
+8. **Hierarchy** - Inclusion/exclusion of objects and categories
+9. **Conservation** - Property invariances despite transformations
+10. **Tool Use** - Manipulating objects to achieve goals
+11. **Intentionality** - Understanding what others want
+12. **Mechanical Reasoning** - Inferring actions from system states
+
+## 🔬 Concept Hacking
 
-# Run setup script (macOS/Linux)
-chmod +x setup.sh
-./setup.sh
+We introduce **Concept Hacking**, a novel controlled evaluation method that systematically manipulates task-relevant features while preserving task-irrelevant conditions. This reveals that MLLMs fail to develop genuine core knowledge understanding and instead rely on shortcut learning as they scale.
+
+## 📊 Evaluation Scale
+
+- **230 MLLMs** evaluated across different model families and sizes
+- **11 different prompts** to ensure robust evaluation
+- **>26,000 total judgments** across all models and tasks
+- **2,530 image-question pairs** in the benchmark
+
+## 🏗️ Website Structure
 
-# Start development server
-bundle exec jekyll serve
 ```
+├── _config.yml # Jekyll configuration
+├── _layouts/
+│ └── default.html # Main layout template
+├── index.html # Homepage with full paper content
+├── assets/
+│ ├── images/ # Paper figures and illustrations
+│ ├── growai.png # Site favicon
+│ └── favicon.svg # Backup favicon
+├── Gemfile # Ruby dependencies
+└── README.md # This file
+```
+
+## 🚀 Local Development
 
-### Manual Setup
+To run the website locally:
 
 ```bash
-# Install Ruby dependencies
+# Install dependencies
 gem install jekyll bundler
 bundle install
 
 # Serve the site locally
 bundle exec jekyll serve --livereload
 ```
 
-Then visit: `http://localhost:4000/RewardAnything`
+Then visit: `http://localhost:4000/core-knowledge`
 
-## 📝 Configuration
+## 👥 Authors
 
-### GitHub Pages Settings
+**Yijiang Li¹**, **Qingying Gao²,§**, **Tianwei Zhao²,§**, **Bingyang Wang³,§**, **Haoran Sun²**, **Haiyun Lyu⁴**, **Robert D. Hawkins⁵**, **Nuno Vasconcelos¹**, **Tal Golan⁶**, **Dezhi Luo⁷,⁸,†**, **Hokin Deng⁹,†**
 
-1. Go to **Repository Settings****Pages**
-2. Source: **GitHub Actions**
-3. The workflow will handle the rest automatically
+¹University of California San Diego, ²Johns Hopkins University, ³Emory University, ⁴University of North Carolina at Chapel Hill, ⁵Stanford University, ⁶Ben-Gurion University of the Negev, ⁷University of Michigan, ⁸University College London, ⁹Carnegie Mellon University
 
-### Environment Variables
+§Equal Contribution, †Corresponding author
 
-The following are configured in `_config.yml`:
+## 📄 Citation
 
-- `github_username`: Your GitHub username
-- `paper_url`: Link to your arXiv paper
-- `huggingface_url`: Link to model weights
-- `pypi_url`: Link to PyPI package
+If you find this work useful in your research, please consider citing:
 
-## 🎨 Customization
-
-### Replacing Placeholder Images
-
-Replace the SVG placeholders in `assets/images/` with your actual logos:
-
-- `logo-placeholder.svg` → Navigation logo
-- `logo-placeholder-white.svg` → Footer logo (white version)
-- `hero-logo-placeholder.svg` → Large hero section logo
-- `favicon.svg` → Browser favicon
-
-### Updating Content
-
-- **Homepage**: Edit `index.html`
-- **Navigation**: Modify `_layouts/default.html`
-- **Site settings**: Update `_config.yml`
-- **Styling**: Customize Tailwind classes in templates
-
-### Adding New Pages
-
-Create new `.html` or `.md` files with front matter:
-
-```yaml
----
-layout: default
-title: "Page Title"
-description: "Page description"
----
-
-Your content here...
+```bibtex
+@article{li2025core,
+title={Core Knowledge Deficits in Multi-Modal Language Models},
+author={Li, Yijiang and Gao, Qingying and Zhao, Tianwei and Wang, Bingyang and Sun, Haoran and Lyu, Haiyun and Luo, Dezhi and Deng, Hokin},
+journal={arXiv preprint arXiv:2410.10855},
+year={2025}
+}
 ```
 
-## 🔧 Troubleshooting
+## 📧 Contact
 
-### Local Development Issues
+For questions about the paper or dataset, please contact the corresponding authors:
+- Dezhi Luo: [dezhi@umich.edu](mailto:dezhi@umich.edu)
+- Hokin Deng: [hokindeng@cmu.edu](mailto:hokindeng@cmu.edu)
 
-```bash
-# Clean build files
-bundle exec jekyll clean
-
-# Rebuild dependencies
-bundle install --force
-
-# Verbose build for debugging
-bundle exec jekyll serve --verbose
-```
-
-### Deployment Issues
-
-1. Check **Actions** tab for build logs
-2. Ensure `pages/` directory changes are pushed to `main`
-3. Verify GitHub Pages settings are correct
+## 🔧 Technical Details
 
-## 📊 Performance
-
-The site is optimized for:
-- ✅ Mobile responsiveness
-- ✅ Fast loading (Tailwind CSS via CDN)
-- ✅ SEO optimization
-- ✅ Accessibility
-- ✅ Modern browsers
-
-## 🤝 Contributing
-
-When making changes:
-
-1. Test locally first: `bundle exec jekyll serve`
-2. Commit changes to `pages/` directory
-3. Push to `main` branch
-4. Automatic deployment will trigger
+The website is built with:
+- **Jekyll** for static site generation
+- **Tailwind CSS** for styling
+- **GitHub Pages** for hosting
+- **Responsive design** optimized for all devices
+- **SEO optimization** for better discoverability
 
 ---
+
+*This website presents the official results and findings from our comprehensive evaluation of multi-modal language models on core cognitive abilities.*
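
The new README describes Concept Hacking only in one sentence. As a rough illustration of how results from such a controlled evaluation might be tallied, the sketch below assumes each original item is paired with a manipulated variant and that each pair records whether the model answered both versions correctly. The function name `tally_pairs`, the category labels, and the sample outcomes are all hypothetical. None of them come from the paper or its code.

```python
from collections import Counter
from typing import Iterable, Tuple


def tally_pairs(results: Iterable[Tuple[bool, bool]]) -> Counter:
    """Count outcomes over (control_correct, manipulated_correct) pairs.

    Hypothetical helper: the labels below are illustrative names only.
    """
    labels = {
        (True, True): "consistent",         # correct on both versions
        (True, False): "shortcut_suspect",  # correct only on the unmodified item
        (False, True): "manipulated_only",  # correct only on the manipulated item
        (False, False): "both_wrong",       # incorrect on both versions
    }
    return Counter(labels[pair] for pair in results)


# Made-up outcomes for illustration only; these are not data from the paper.
example = [(True, True), (True, False), (True, False), (False, False)]
counts = tally_pairs(example)
print(counts)
```

A high share of `shortcut_suspect` pairs would be one possible sign that a model relies on surface cues, though the paper's own analysis may define its categories differently.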

_config.yml

Lines changed: 4 additions & 4 deletions
@@ -1,7 +1,7 @@
-title: Core Cognition"
+title: "Core Cognition"
 description: "Core Knowledge Deficits in Multi-Modal Language Models"
-url: "https://grow-ai-like-a-child.github.io"
-baseurl: "/CoreCognition"
+url: "https://williamium3000.github.io"
+baseurl: "/core-knowledge"
 
 # Build settings
 markdown: kramdown
@@ -30,7 +30,7 @@ exclude:
 - README.md
 
 # Social links
-github_username: grow-ai-like-a-child
+github_username: williamium3000
 paper_url: "https://arxiv.org/abs/2410.10855"
 huggingface_url: "https://huggingface.co/grow-ai-like-a-child"
 pypi_url: ""
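
The `url` and `baseurl` values changed in `_config.yml` together determine the address the site is served from, which is why the README's website link and local preview path change in the same commit. The sketch below is a simplified illustration of that composition, not Jekyll's actual implementation. The helper name `site_root` is hypothetical.

```python
def site_root(url: str, baseurl: str) -> str:
    """Join a Jekyll-style `url` and `baseurl` into one address (hypothetical helper)."""
    host = url.rstrip("/")     # drop any trailing slash from the host part
    base = baseurl.strip("/")  # drop leading and trailing slashes from the path part
    return f"{host}/{base}" if base else host


# Values before and after this commit, copied from the _config.yml diff.
OLD = site_root("https://grow-ai-like-a-child.github.io", "/CoreCognition")
NEW = site_root("https://williamium3000.github.io", "/core-knowledge")

# The local preview keeps the same baseurl but uses the development host and port
# given in the README.
LOCAL = site_root("http://localhost:4000", "/core-knowledge")

print(OLD)
print(NEW)
print(LOCAL)
```

Under these assumptions, `NEW` matches the website link added to the README and `LOCAL` matches the address in its "Then visit" line.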
