diff --git a/mmevol/README.md b/mmevol/README.md
index 9202f1af..f7c8b1a6 100644
--- a/mmevol/README.md
+++ b/mmevol/README.md
@@ -1,6 +1,9 @@
# MMEvol: Empowering Multimodal Large Language Models with Evol-Instruct
+
+
+
-
-
-
[[arXiv Paper](https://arxiv.org/pdf/2409.05840)] [[Dataset](https://huggingface.co/datasets/Tongyi-ConvAI/MMEvol)] [[Models](https://huggingface.co/models/Tongyi-ConvAI/MMEvol)]
-MMEvol is the first method that successfully introduces Evol-Instruct into multimodal domain to improve the diversity and complexity of multimodal instruction data. Compared with previous methods like vila2, MIMIC-IT, and MMInstruct, it can perform iterative evolution in a very elegant and simple way in a fully automatic way, breaking through human imagination of data complexity and diversity. It has no restrictions on the form of data, the type of task, or complex processing. It can quickly perform self-iterative evolution on limited image instruction data to obtain ultra-high-quality multimodal data, thereby giving multimodal models more powerful capabilities. At the same time, it can be orthogonally combined with other data flow-driven methods such as vila2, MIMIC-IT, and MMInstruct to obtain more powerful data construction effects. Everyone is welcome to experience it now!
+MMEvol is the first method to successfully introduce Evol-Instruct into the multimodal domain to improve the diversity and complexity of multimodal instruction data. Unlike previous methods such as VILA2, MIMIC-IT, and MMInstruct, it performs iterative evolution in a simple, elegant, and fully automatic way, pushing past the limits of manually designed data complexity and diversity. It places no restrictions on data format, task type, or processing pipeline, and can quickly self-evolve a limited set of image instruction data into ultra-high-quality multimodal data, giving multimodal models more powerful capabilities. It can also be combined orthogonally with other data-driven methods such as VILA2, MIMIC-IT, and MMInstruct for even stronger data construction. Everyone is welcome to try it out!
## Update
@@ -103,8 +101,8 @@ Here are the pretrained weights and instruction tuning weights
| Model | Pretrained Projector | Base LLM | PT Data | IT Data | Download |
| ---------------- | -------------------- | --------- | ------------------------------------------------------------ | ------- | -------- |
-| MMEvol-Qwen2-7B | [mm_projector]() | Qwen2-7B | [LLaVA-Pretrain](https://huggingface.co/datasets/liuhaotian/LLaVA-Pretrain) | MMEvol | [ckpt]() |
-| MMEvol-LLaMA3-8B | [mm_projector]() | LLaMA3-8B | [LLaVA-Pretrain](https://huggingface.co/datasets/liuhaotian/LLaVA-Pretrain) | MMEvol | [ckpt]() |
+| MMEvol-Qwen2-7B | [mm_projector](https://huggingface.co/models/Tongyi-ConvAI/MMEvol) | Qwen2-7B | [LLaVA-Pretrain](https://huggingface.co/datasets/liuhaotian/LLaVA-Pretrain) | MMEvol | [ckpt](https://huggingface.co/models/Tongyi-ConvAI/MMEvol) |
+| MMEvol-LLaMA3-8B | [mm_projector](https://huggingface.co/models/Tongyi-ConvAI/MMEvol) | LLaMA3-8B | [LLaVA-Pretrain](https://huggingface.co/datasets/liuhaotian/LLaVA-Pretrain) | MMEvol | [ckpt](https://huggingface.co/models/Tongyi-ConvAI/MMEvol) |
### Performance
@@ -255,9 +253,10 @@ bash scripts/v1_6/train/llama3/finetune.sh
bash scripts/v1_6/train/qwen2/finetune.sh
```
-
## Evaluation
+> **Note:** Ensure that your `api_base` and `key` are correctly configured before evaluation.
+
## OpenCompass
First, enter the `vlmevalkit` directory and install all dependencies:
@@ -313,6 +312,8 @@ While scoring on each benchmark directly, set `MODE=all`. If only inference resu
./script/run_inference.sh MMEvol-Llama3-V-1_6 MathVista_MINI all
.....
+# NOTE: BLINK must be evaluated separately with llava/eval/blink_eval.py.
+python llava/eval/blink_eval.py
```
@@ -335,22 +336,24 @@ python llava/eval/mminst_eval.py
+
+
## Visualization
Tongyi-ConvAI generates this dataset for multimodal supervised fine-tuning. It was used to train **Evol-Llama3-8B-Instruct** and **Evol-Qwen2-7B**, reported in [our paper](https://arxiv.org/pdf/2409.05840). To create this dataset, we first selected a 163K seed instruction-tuning dataset (SEED-163K) for Evol-Instruct, and then enhanced data quality through an iterative process that combines fine-grained perception, cognitive reasoning, and interaction evolution. This process yields a more complex and diverse image-text instruction dataset, which in turn empowers MLLMs with enhanced capabilities. Below we showcase the detailed data distribution of SEED-163K, which is prepared for the multi-round evolution mentioned above. More details can be found in our paper.
-
+
diff --git a/mmevol/dataengine/README.md b/mmevol/dataengine/README.md
new file mode 100644
index 00000000..0aeb4246
--- /dev/null
+++ b/mmevol/dataengine/README.md
@@ -0,0 +1,79 @@
+# Data construction pipeline for MMEvol-480k
+
+Run Luo1,2*,
+Haonan Zhang3*,
+Longze Chen1,2*,
+Ting-En Lin3*,
+Xiong Liu3,
+Yuchuan Wu3,
+Min Yang1,2†,
+Yongbin Li3†,
+Minzheng Wang2,
+Pengpeng Zeng4,
+Lianli Gao5,
+Heng Tao Shen4,
+Yunshui Li1,2,
+Xiaobo Xia6,
+Fei Huang3,
+Jingkuan Song4†,
+
+\* Equal contribution † Corresponding author
+
+1 Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences
+2 University of Chinese Academy of Sciences
+3 Alibaba Group
+4 Tongji University
+5 Independent Researcher
+6 The University of Sydney
+
+![Multi-Modal](https://img.shields.io/badge/Task-Multi--Modal-red)
+
+
+
+
+ [[arXiv Paper](https://arxiv.org/pdf/2409.05840)] [[Dataset](https://huggingface.co/datasets/Tongyi-ConvAI/MMEvol)] [[Models](https://huggingface.co/models/Tongyi-ConvAI/MMEvol)]
+
+Follow the instructions below to generate MMEvol-480k.
+
+1. Download the SEED-163k JSON file (`mm_seed_no_evo_163k.json`) from [Hugging Face](https://huggingface.co/datasets/Tongyi-ConvAI/MMEvol/tree/main/jsons) and place it under `./dataengine/datasets`.
+2. Run the preprocessing script under `dataengine/datasets` to extract each sample into the `meta_data` folder:
+```shell
+python dataengine/datasets/process.py
+```
+3. Prepare the data storage folder following the format of `./dataengine/evolution/folder_template`: copy `folder_template` and rename it to your dataset name, _e.g._, `mmevol_1k_evo.json`.
+4. Ensure that your `api_base` and `key` are correctly configured before starting generation. Set them in both:
+
+- lines 129-130 of `dataengine/multi_round.py`
+- lines 126-127 of `dataengine/score_process/difficulty_scoring_v123.py`
+5. Run the following command to begin the three-round data evolution:
+```shell
+python dataengine/multi_round.py
+```
+Three rounds of evolution are performed on top of SEED-163k, with data filtering at the end of each round. The final evolved data is stored under the `./datasets` path.
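The round-chaining logic that `multi_round.py` implements can be sketched as follows. This is a minimal illustration, not the actual script: `seed_path_for_round` is a hypothetical helper name, and the paths mirror the defaults visible in the diff below.

```python
import os.path as osp

def seed_path_for_round(round_n, root_path, meta_data_path):
    """Round 1 evolves the raw seed samples from meta_data; each later
    round evolves the filtered output of the previous round."""
    if round_n == 1:
        return meta_data_path
    return osp.join(root_path, "round{}".format(round_n - 1), "filtered_qa")

# Example: chain three evolution rounds together
root = "/mnt/data/haonan/code/dataengine/evolution/multi_round_single_imgs_1k_mini"
meta = "/mnt/data/haonan/code/dataengine/datasets/meta_data"
for round_n in [1, 2, 3]:
    print(round_n, seed_path_for_round(round_n, root, meta))
```

Each round thus consumes the previous round's `filtered_qa` directory, which is why the filtering step at the end of every round is required before the next round can start.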
+
+**License**: Please follow [Meta Llama 3.1 Community License](https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/LICENSE) and [Gemma License](https://www.kaggle.com/models/google/gemma/license/).
+
+## Citation
+
+```bibtex
+@article{luo2024mmevol,
+ title={Mmevol: Empowering multimodal large language models with evol-instruct},
+ author={Luo, Run and Zhang, Haonan and Chen, Longze and Lin, Ting-En and Liu, Xiong and Wu, Yuchuan and Yang, Min and Wang, Minzheng and Zeng, Pengpeng and Gao, Lianli and others},
+ journal={arXiv preprint arXiv:2409.05840},
+ year={2024}
+}
+```
+
+**Contact**:
+
+- Run Luo โ r.luo@siat.ac.cn
+
+- Haonan Zhang โ zchiowal@gmail.com
diff --git a/mmevol/dataengine/assets/mmevol_dis_cam.png b/mmevol/dataengine/assets/mmevol_dis_cam.png
new file mode 100644
index 00000000..d63f4277
Binary files /dev/null and b/mmevol/dataengine/assets/mmevol_dis_cam.png differ
diff --git a/mmevol/dataengine/assets/mmevol_logo.png b/mmevol/dataengine/assets/mmevol_logo.png
new file mode 100644
index 00000000..76ec2126
Binary files /dev/null and b/mmevol/dataengine/assets/mmevol_logo.png differ
diff --git a/mmevol/dataengine/assets/mmevol_long_tail.png b/mmevol/dataengine/assets/mmevol_long_tail.png
new file mode 100644
index 00000000..30e96b2e
Binary files /dev/null and b/mmevol/dataengine/assets/mmevol_long_tail.png differ
diff --git a/mmevol/dataengine/assets/mmevol_pai.png b/mmevol/dataengine/assets/mmevol_pai.png
new file mode 100644
index 00000000..e1070bd6
Binary files /dev/null and b/mmevol/dataengine/assets/mmevol_pai.png differ
diff --git a/mmevol/dataengine/assets/mmevol_performance.png b/mmevol/dataengine/assets/mmevol_performance.png
new file mode 100644
index 00000000..a2795f93
Binary files /dev/null and b/mmevol/dataengine/assets/mmevol_performance.png differ
diff --git a/mmevol/mmevol_sft_data/assets/seed_dis.jpg b/mmevol/dataengine/assets/mmevol_seed_dis.jpg
similarity index 100%
rename from mmevol/mmevol_sft_data/assets/seed_dis.jpg
rename to mmevol/dataengine/assets/mmevol_seed_dis.jpg
diff --git a/mmevol/mmevol_sft_data/base.py b/mmevol/dataengine/base.py
similarity index 100%
rename from mmevol/mmevol_sft_data/base.py
rename to mmevol/dataengine/base.py
diff --git a/mmevol/dataengine/datasets/process.py b/mmevol/dataengine/datasets/process.py
new file mode 100644
index 00000000..866583af
--- /dev/null
+++ b/mmevol/dataengine/datasets/process.py
@@ -0,0 +1,24 @@
+import json
+import os
+import os.path as osp
+from tqdm import tqdm
+import shutil
+
+# Construct a hash_id as a unique index, because both the "id" and "image" keys contain duplicate values
+datasets_path = "/mnt/data/haonan/code/dataengine/datasets"
+
+a = json.load(open(osp.join(datasets_path, "seed_data_1k_demo.json"), "r"))
+for index, i in enumerate(a):
+ i["hash_id"] = str(index) + "_" + i["image"].replace("/", "_")
+
+json.dump(a, open(osp.join(datasets_path, "seed_data_1k_demo.json"), "w"), indent=4)
+
+# Reset the meta_data folder and store each sample as a separate file
+if os.path.exists(osp.join(datasets_path, "meta_data")):
+    shutil.rmtree(osp.join(datasets_path, "meta_data"))
+os.mkdir(osp.join(datasets_path, "meta_data"))
+
+data = json.load(open(osp.join(datasets_path, "seed_data_1k_demo.json"), "r"))
+
+for index, d in enumerate(tqdm(data)):
+ json.dump(d, open(osp.join(datasets_path, "meta_data", "{}.json".format(d["hash_id"])), "w"), indent=4)
\ No newline at end of file
diff --git a/mmevol/mmevol_sft_data/multi_round.py b/mmevol/dataengine/multi_round.py
similarity index 98%
rename from mmevol/mmevol_sft_data/multi_round.py
rename to mmevol/dataengine/multi_round.py
index bde792a4..793f62c1 100644
--- a/mmevol/mmevol_sft_data/multi_round.py
+++ b/mmevol/dataengine/multi_round.py
@@ -1,6 +1,6 @@
import os
import sys
-sys.path.append("/mnt/data/haonan/code/mmevol_sft_data")
+sys.path.append("/mnt/data/haonan/code/dataengine")
from base import BaseAPI
import numpy as np
from tqdm import tqdm
@@ -466,13 +466,13 @@ def filter_round3(meta_data, conversation_v3_path):
if __name__=='__main__':
- final_save_path = "/mnt/data/haonan/code/mmevol_sft_data/datasets/seed_data_1k_demo_evo.json"
- root_path = '/mnt/data/haonan/code/mmevol_sft_data/evolution/multi_round_single_imgs_1k_mini'
+ final_save_path = "/mnt/data/haonan/code/dataengine/datasets/seed_data_1k_demo_evo.json"
+ root_path = '/mnt/data/haonan/code/dataengine/evolution/multi_round_single_imgs_1k_mini'
img_path = '/mnt/workspace/lr/datasets'
for round_n in [1,2,3]:
if round_n == 1:
- seed_data_path = "/mnt/data/haonan/code/mmevol_sft_data/datasets/meta_data"
+ seed_data_path = "/mnt/data/haonan/code/dataengine/datasets/meta_data"
else:
seed_data_path = osp.join(root_path, "round{}".format(round_n-1), "filtered_qa")
@@ -534,4 +534,4 @@ def filter_round3(meta_data, conversation_v3_path):
merged_data.append(data)
json.dump(merged_data, open(final_save_path, "w"), indent=4)
- print("Saveing file to {}".format(final_save_path))
+    print("Saving file to {}".format(final_save_path))
\ No newline at end of file
diff --git a/mmevol/mmevol_sft_data/prompt.py b/mmevol/dataengine/prompt.py
similarity index 100%
rename from mmevol/mmevol_sft_data/prompt.py
rename to mmevol/dataengine/prompt.py
diff --git a/mmevol/mmevol_sft_data/score_process/base.py b/mmevol/dataengine/score_process/base.py
similarity index 100%
rename from mmevol/mmevol_sft_data/score_process/base.py
rename to mmevol/dataengine/score_process/base.py
diff --git a/mmevol/mmevol_sft_data/score_process/difficulty_scoring_v0.py b/mmevol/dataengine/score_process/difficulty_scoring_v0.py
similarity index 92%
rename from mmevol/mmevol_sft_data/score_process/difficulty_scoring_v0.py
rename to mmevol/dataengine/score_process/difficulty_scoring_v0.py
index 98e47f04..e38cefc6 100644
--- a/mmevol/mmevol_sft_data/score_process/difficulty_scoring_v0.py
+++ b/mmevol/dataengine/score_process/difficulty_scoring_v0.py
@@ -124,12 +124,9 @@ def __init__(self,
print('Unknown API Base. ')
sys.exit(-1)
- self.api_base="http://47.88.8.18:8088/api/ask"
- # self.api_base = "http://47.88.8.18:8088/api/ask?tenant=gpt-4o-mini"
- # self.key = "eyJ0eXAiOiJqd3QiLCJhbGciOiJIUzI1NiJ9.eyJ1c2VybmFtZSI6IjI1ODczMCIsInBhc3N3b3JkIjoiMjU4NzMwMTIzIiwiZXhwIjoyMDE5NTUwNzAxfQ.JuqnTa7yauGkSzWkBiEig1K_rxvfAYTXS9F9_m-h4q8"
- # self.key = "eyJ0eXAiOiJqd3QiLCJhbGciOiJIUzI1NiJ9.eyJ1c2VybmFtZSI6IjI3NDM2OCIsInBhc3N3b3JkIjoiMjc0MzY4MTIzIiwiZXhwIjoyMDEyNjEzNjA4fQ.7OUpHs-AFPaFHuUy_p7XxXyNYhca2_-7F5GBtaahfe4"
- self.key = "eyJhbGciOiJIUzI1NiIsInR5cCI6Imp3dCJ9.eyJ1c2VybmFtZSI6IjQ0MzQ1NSIsInBhc3N3b3JkIjoiNDQzNDU1MTIzIiwiZXhwIjoyMDMxNzA1NTA3fQ.7g4a6t9dKcRXVRa7MwQb5m2oirFu1OxjXhWbNM0w50s"
- # self.key = "eyJhbGciOiJIUzI1NiIsInR5cCI6Imp3dCJ9.eyJ1c2VybmFtZSI6IjQzOTg2OSIsInBhc3N3b3JkIjoiNDM5ODY5MTIzIiwiZXhwIjoyMDMxNzA3NjkzfQ.ly9XNzVW7pEeW_bTZxzaqB3jt2kRr14XQIpT0DbCTto"
+ self.api_base = ""
+ self.key = ""
+
# self.model = "gpt-4o-2024-08-06"
self.model = "gpt-4o-mini"
diff --git a/mmevol/mmevol_sft_data/score_process/difficulty_scoring_v123.py b/mmevol/dataengine/score_process/difficulty_scoring_v123.py
similarity index 95%
rename from mmevol/mmevol_sft_data/score_process/difficulty_scoring_v123.py
rename to mmevol/dataengine/score_process/difficulty_scoring_v123.py
index 75536e29..09fdb208 100644
--- a/mmevol/mmevol_sft_data/score_process/difficulty_scoring_v123.py
+++ b/mmevol/dataengine/score_process/difficulty_scoring_v123.py
@@ -123,10 +123,9 @@ def __init__(self,
print('Unknown API Base. ')
sys.exit(-1)
- self.api_base="http://47.88.8.18:8088/api/ask"
- # self.api_base = "http://47.88.8.18:8088/api/ask?tenant=gpt-4o-mini"
- # self.key = "eyJ0eXAiOiJqd3QiLCJhbGciOiJIUzI1NiJ9.eyJ1c2VybmFtZSI6IjI1ODczMCIsInBhc3N3b3JkIjoiMjU4NzMwMTIzIiwiZXhwIjoyMDE5NTUwNzAxfQ.JuqnTa7yauGkSzWkBiEig1K_rxvfAYTXS9F9_m-h4q8"
- self.key = "eyJhbGciOiJIUzI1NiIsInR5cCI6Imp3dCJ9.eyJ1c2VybmFtZSI6IjQ0MzQ1NSIsInBhc3N3b3JkIjoiNDQzNDU1MTIzIiwiZXhwIjoyMDMxNzA1NTA3fQ.7g4a6t9dKcRXVRa7MwQb5m2oirFu1OxjXhWbNM0w50s"
+ self.api_base = ""
+ self.key = ""
+
# self.model="gpt-4o-2024-05-13"
self.model = "gpt-4o-mini"
diff --git a/mmevol/mmevol_sft_data/score_process/prompt_score.py b/mmevol/dataengine/score_process/prompt_score.py
similarity index 100%
rename from mmevol/mmevol_sft_data/score_process/prompt_score.py
rename to mmevol/dataengine/score_process/prompt_score.py
diff --git a/mmevol/mmevol_sft_data/utils/a.ipynb b/mmevol/dataengine/utils/a.ipynb
similarity index 100%
rename from mmevol/mmevol_sft_data/utils/a.ipynb
rename to mmevol/dataengine/utils/a.ipynb
diff --git a/mmevol/mmevol_sft_data/utils/bertopic.ipynb b/mmevol/dataengine/utils/bertopic.ipynb
similarity index 100%
rename from mmevol/mmevol_sft_data/utils/bertopic.ipynb
rename to mmevol/dataengine/utils/bertopic.ipynb
diff --git a/mmevol/mmevol_sft_data/utils/coco_80_labels.txt b/mmevol/dataengine/utils/coco_80_labels.txt
similarity index 100%
rename from mmevol/mmevol_sft_data/utils/coco_80_labels.txt
rename to mmevol/dataengine/utils/coco_80_labels.txt
diff --git a/mmevol/mmevol_sft_data/utils/data_process.py b/mmevol/dataengine/utils/data_process.py
similarity index 100%
rename from mmevol/mmevol_sft_data/utils/data_process.py
rename to mmevol/dataengine/utils/data_process.py
diff --git a/mmevol/mmevol_sft_data/utils/object_count.json b/mmevol/dataengine/utils/object_count.json
similarity index 100%
rename from mmevol/mmevol_sft_data/utils/object_count.json
rename to mmevol/dataengine/utils/object_count.json
diff --git a/mmevol/mmevol_sft_data/utils/small_obj.txt b/mmevol/dataengine/utils/small_obj.txt
similarity index 100%
rename from mmevol/mmevol_sft_data/utils/small_obj.txt
rename to mmevol/dataengine/utils/small_obj.txt
diff --git a/mmevol/mmevol_sft_data/utils/small_obj_process.txt b/mmevol/dataengine/utils/small_obj_process.txt
similarity index 100%
rename from mmevol/mmevol_sft_data/utils/small_obj_process.txt
rename to mmevol/dataengine/utils/small_obj_process.txt
diff --git a/mmevol/llava/eval/mmvp_eval.py b/mmevol/llava/eval/mmvp_eval.py
index 7b9967a1..734ce24f 100644
--- a/mmevol/llava/eval/mmvp_eval.py
+++ b/mmevol/llava/eval/mmvp_eval.py
@@ -109,11 +109,12 @@ def make_request(meta):
with Pool(processes=50) as pool:
output = list(tqdm(pool.imap(make_request, data), total=len(data)))
-print(output)
-for i in set(all_types):
+# print(output)
+# for i in set(all_types):
- for j in data:
- if j['type']==i
+# for j in data:
+# if j['type']==i
+
num_correct, num_total = 0, 0
# Continue with the processing of the JSONL file
index=0
diff --git a/mmevol/mmevol_sft_data/README.md b/mmevol/mmevol_sft_data/README.md
deleted file mode 100644
index c5dd04e5..00000000
--- a/mmevol/mmevol_sft_data/README.md
+++ /dev/null
@@ -1,51 +0,0 @@
-
-
-
-
-
-# MMEvol: Empowering Multimodal Large Language Models with Evol-Instruct
-
-This is the official data collection of the paper "MMEvol: Empowering Multimodal Large Language Models with Evol-Instruct", the dataset and checkpoint will be released soon.
-
-We are continuously refactoring our code, be patient and wait for the latest updates!
-
-## ๐ Links
-- Project Web: https://mmevol.github.io/
-
-- Arxiv Paper: https://arxiv.org/pdf/2409.05840
-
-- Code: Coming soon
-
-## ๐งช Dataset Details
-
-The Tongyi-ConvAI generates this dataset for multi-modal supervised fine-tuning. This dataset was used to train **Evol-Llama3-8B-Instruct** and **Evol-Qwen2-7B** reported in [our paper](https://arxiv.org/pdf/2409.05840).
-
-To create this dataset, we first selected 163K Seed Instruction Tuning Dataset for Evol-Instruct, then we enhance data quality through an iterative process that involves a refined combination of fine-grained perception, cognitive reasoning, and interaction evolution. This process results in the generation of a more complex and diverse image-text instruction dataset, which in turn empowers MLLMs with enhanced capabilities.
-
-Below we showcase the detailed data distribution of the SEED-163K, which is prepared for multi-round evolution mentioned above:
-
-
-
- Fig. 2. SEED-163K: 163K Curated Seed Instruction Tuning Dataset for Evol-Instruct
-
-
-
-
-**License**: Please follow [Meta Llama 3.1 Community License](https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/LICENSE) and [Gemma License](https://www.kaggle.com/models/google/gemma/license/).
-
-## ๐ Citation
-
-```bibtex
-@article{luo2024mmevol,
- title={Mmevol: Empowering multimodal large language models with evol-instruct},
- author={Luo, Run and Zhang, Haonan and Chen, Longze and Lin, Ting-En and Liu, Xiong and Wu, Yuchuan and Yang, Min and Wang, Minzheng and Zeng, Pengpeng and Gao, Lianli and others},
- journal={arXiv preprint arXiv:2409.05840},
- year={2024}
-}
-```
-
-**Contact**:
-
-- Run Luo โ r.luo@siat.ac.cn
-
-- Haonan Zhang โ zchiowal@gmail.com
diff --git a/mmevol/mmevol_sft_data/assets/mmevol.jpg b/mmevol/mmevol_sft_data/assets/mmevol.jpg
deleted file mode 100644
index d280d886..00000000
Binary files a/mmevol/mmevol_sft_data/assets/mmevol.jpg and /dev/null differ
diff --git a/mmevol/mmevol_sft_data/datasets/process.ipynb b/mmevol/mmevol_sft_data/datasets/process.ipynb
deleted file mode 100644
index e1236f26..00000000
--- a/mmevol/mmevol_sft_data/datasets/process.ipynb
+++ /dev/null
@@ -1,65 +0,0 @@
-{
- "cells": [
- {
- "cell_type": "code",
- "execution_count": 4,
- "metadata": {},
- "outputs": [
- {
- "name": "stderr",
- "output_type": "stream",
- "text": [
- "100%|โโโโโโโโโโ| 1612/1612 [00:15<00:00, 103.12it/s]\n"
- ]
- }
- ],
- "source": [
- "import json\n",
- "import os\n",
- "import os.path as osp\n",
- "from tqdm import tqdm\n",
- "import shutil\n",
- "\n",
- "# Construct hash_id to create a unique index, because both id and image key values โโhave duplicate values\n",
- "datasets_path = \"/mnt/data/haonan/code/mmevol_sft_data/datasets\"\n",
- "\n",
- "a = json.load(open(osp.join(datasets_path, \"seed_data_1k_demo.json\"), \"r\"))\n",
- "for index, i in enumerate(a):\n",
- " i[\"hash_id\"] = str(index) + \"_\" + i[\"image\"].replace(\"/\", \"_\")\n",
- "\n",
- "json.dump(a, open(\"/mnt/data/haonan/code/mmevol_sft_data/datasets/seed_data_1k_demo.json\", \"w\"), indent=4)\n",
- "\n",
- "# If the data format is already well organized, store it separately in meta data\n",
- "if os.path.exists(osp.join(datasets_path, \"meta_data\")):\n",
- " shutil.rmtree(osp.join(datasets_path, \"meta_data\"))\n",
- " os.mkdir(osp.join(datasets_path, \"meta_data\"))\n",
- "\n",
- "data = json.load(open(osp.join(datasets_path, \"seed_data_1k_demo.json\"), \"r\"))\n",
- "\n",
- "for index, d in enumerate(tqdm(data)):\n",
- " json.dump(d, open(osp.join(datasets_path, \"meta_data\", \"{}.json\".format(d[\"hash_id\"])), \"w\"), indent=4)"
- ]
- }
- ],
- "metadata": {
- "kernelspec": {
- "display_name": "Python 3",
- "language": "python",
- "name": "python3"
- },
- "language_info": {
- "codemirror_mode": {
- "name": "ipython",
- "version": 3
- },
- "file_extension": ".py",
- "mimetype": "text/x-python",
- "name": "python",
- "nbconvert_exporter": "python",
- "pygments_lexer": "ipython3",
- "version": "3.10.14"
- }
- },
- "nbformat": 4,
- "nbformat_minor": 2
-}
diff --git a/mmevol/vlmevalkit/.DS_Store b/mmevol/vlmevalkit/.DS_Store
deleted file mode 100644
index f9863cd7..00000000
Binary files a/mmevol/vlmevalkit/.DS_Store and /dev/null differ
diff --git a/mmevol/vlmevalkit/vlmeval/.DS_Store b/mmevol/vlmevalkit/vlmeval/.DS_Store
deleted file mode 100644
index 171a3172..00000000
Binary files a/mmevol/vlmevalkit/vlmeval/.DS_Store and /dev/null differ
diff --git a/mmevol/vlmevalkit/vlmeval/api/gpt.py b/mmevol/vlmevalkit/vlmeval/api/gpt.py
index 14f67c09..a88e8820 100644
--- a/mmevol/vlmevalkit/vlmeval/api/gpt.py
+++ b/mmevol/vlmevalkit/vlmeval/api/gpt.py
@@ -91,15 +91,14 @@ def __init__(self,
else:
self.logger.error('Unknown API Base. ')
sys.exit(-1)
+
# your api_base
- self.api_base=""
+ self.api_base = ""
# your key
- self.key=""
+ self.key = ""
assert len(self.api_base)>0 and len(self.key)>0, "make sure that both api_base and key are configured correctly"
-
-
# self.model="gpt-4o-2024-05-13"
model = "gpt-4o-mini"
self.logger.info(f'Using API Base: {self.api_base}; API Key: {self.key}')