Initial Query

Create a JSON metadata for these models: Mistral-7B-v0.1 RedPajama-INCITE-7B-Base RedPajama-INCITE-Base-3B-v1 falcon40b falcon7b gpt2-xl llama-65b llama-7b neo-1.3 neo-2.7 neo-6 open_llama_3b_v2 open_llama_7b_v2 opt-1.3b opt-6.7b pythia-1.4 pythia-1.4-dedup pythia-6.9 pythia-6.9-dedup With the format: {"Mistral-7B-v0.1": {"size": 7, dataset: "", "family": "Mistral"}, ...} where family is one of base = [ 'pythia', 'llama', 'Mistral', 'gpt2', 'opt', 'RedPajama', 'neo', 'open_llama', 'falcon' ] gpt2-xl is 1.5b parameters.

LLM Generation

Query

Create a JSON metadata for these models: Mistral-7B-v0.1 RedPajama-INCITE-7B-Base RedPajama-INCITE-Base-3B-v1 falcon40b falcon7b gpt2-xl llama-65b llama-7b neo-1.3 neo-2.7 neo-6 open_llama_3b_v2 open_llama_7b_v2 opt-1.3b opt-6.7b pythia-1.4 pythia-1.4-dedup pythia-6.9 pythia-6.9-dedup With the format: {"Mistral-7B-v0.1": {"size": 7, dataset: "", "family": "Mistral"}, ...} where family is one of base = [ 'pythia', 'llama', 'Mistral', 'gpt2', 'opt', 'RedPajama', 'neo', 'open_llama', 'falcon' ] gpt2-xl is 1.5b parameters.

Output

Here is the JSON metadata for the models:

{
  "Mistral-7B-v0.1": {
    "size": 7,
    "dataset": "",
    "family": "Mistral"
  },
  "RedPajama-INCITE-7B-Base": {
    "size": 7,
    "dataset": "INCITE",
    "family": "RedPajama"
  },
  "RedPajama-INCITE-Base-3B-v1": {
    "size": 3,
    "dataset": "INCITE",
    "family": "RedPajama"
  },
  "falcon40b": {
    "size": 40,
    "dataset": "",
    "family": "falcon"
  },
  "falcon7b": {
    "size": 7,
    "dataset": "",
    "family": "falcon"
  },
  "gpt2-xl": {
    "size": 1.5,
    "dataset": "",
    "family": "gpt2"
  },
  "llama-65b": {
    "size": 65,
    "dataset": "",
    "family": "llama"
  },
  "llama-7b": {
    "size": 7,
    "dataset": "",
    "family": "llama"
  },
  "neo-1.3": {
    "size": 1.3,
    "dataset": "",
    "family": "neo"
  },
  "neo-2.7": {
    "size": 2.7,
    "dataset": "",
    "family": "neo"
  },
  "neo-6": {
    "size": 6,
    "dataset": "",
    "family": "neo"
  },
  "open_llama_3b_v2": {
    "size": 3,
    "dataset": "",
    "family": "open_llama"
  },
  "open_llama_7b_v2": {
    "size": 7,
    "dataset": "",
    "family": "open_llama"
  },
  "opt-1.3b": {
    "size": 1.3,
    "dataset": "",
    "family": "opt"
  },
  "opt-6.7b": {
    "size": 6.7,
    "dataset": "",
    "family": "opt"
  },
  "pythia-1.4": {
    "size": 1.4,
    "dataset": "",
    "family": "pythia"
  },
  "pythia-1.4-dedup": {
    "size": 1.4,
    "dataset": "",
    "family": "pythia"
  },
  "pythia-6.9": {
    "size": 6.9,
    "dataset": "",
    "family": "pythia"
  },
  "pythia-6.9-dedup": {
    "size": 6.9,
    "dataset": "",
    "family": "pythia"
  }
}
Note: I assumed that the `dataset` field is empty for all models except for `RedPajama-INCITE-7B-Base` and `RedPajama-INCITE-Base-3B-v1`. If you have more information about the datasets used to train these models, feel free to update the `dataset` field accordingly.

Extract Json

I extracted the following JSON from that output:

{
  "Mistral-7B-v0.1": {
    "size": 7,
    "dataset": "",
    "family": "Mistral"
  },
  "RedPajama-INCITE-7B-Base": {
    "size": 7,
    "dataset": "INCITE",
    "family": "RedPajama"
  },
  "RedPajama-INCITE-Base-3B-v1": {
    "size": 3,
    "dataset": "INCITE",
    "family": "RedPajama"
  },
  "falcon40b": {
    "size": 40,
    "dataset": "",
    "family": "falcon"
  },
  "falcon7b": {
    "size": 7,
    "dataset": "",
    "family": "falcon"
  },
  "gpt2-xl": {
    "size": 1.5,
    "dataset": "",
    "family": "gpt2"
  },
  "llama-65b": {
    "size": 65,
    "dataset": "",
    "family": "llama"
  },
  "llama-7b": {
    "size": 7,
    "dataset": "",
    "family": "llama"
  },
  "neo-1.3": {
    "size": 1.3,
    "dataset": "",
    "family": "neo"
  },
  "neo-2.7": {
    "size": 2.7,
    "dataset": "",
    "family": "neo"
  },
  "neo-6": {
    "size": 6,
    "dataset": "",
    "family": "neo"
  },
  "open_llama_3b_v2": {
    "size": 3,
    "dataset": "",
    "family": "open_llama"
  },
  "open_llama_7b_v2": {
    "size": 7,
    "dataset": "",
    "family": "open_llama"
  },
  "opt-1.3b": {
    "size": 1.3,
    "dataset": "",
    "family": "opt"
  },
  "opt-6.7b": {
    "size": 6.7,
    "dataset": "",
    "family": "opt"
  },
  "pythia-1.4": {
    "size": 1.4,
    "dataset": "",
    "family": "pythia"
  },
  "pythia-1.4-dedup": {
    "size": 1.4,
    "dataset": "",
    "family": "pythia"
  },
  "pythia-6.9": {
    "size": 6.9,
    "dataset": "",
    "family": "pythia"
  },
  "pythia-6.9-dedup": {
    "size": 6.9,
    "dataset": "",
    "family": "pythia"
  }
}

JSON Subset Evaluator

Testing if the previous output matches the JSON: { "Mistral-7B-v0.1": { "size": 7, "dataset": "", "family": "Mistral" }, "RedPajama-INCITE-7B-Base": { "size": 7, "dataset": "", "family": "RedPajama" }, "RedPajama-INCITE-Base-3B-v1": { "size": 3, "dataset": "", "family": "RedPajama" }, "falcon40b": { "size": 40, "dataset": "", "family": "falcon" }, "falcon7b": { "size": 7, "dataset": "", "family": "falcon" }, "gpt2-xl": { "size": 1.5, "dataset": "", "family": "gpt2" }, "llama-65b": { "size": 65, "dataset": "", "family": "llama" }, "llama-7b": { "size": 7, "dataset": "", "family": "llama" }, "neo-1.3": { "size": 1.3, "dataset": "", "family": "neo" }, "neo-2.7": { "size": 2.7, "dataset": "", "family": "neo" }, "neo-6": { "size": 6, "dataset": "", "family": "neo" }, "open_llama_3b_v2": { "size": 3, "dataset": "", "family": "open_llama" }, "open_llama_7b_v2": { "size": 7, "dataset": "", "family": "open_llama" }, "opt-1.3b": { "size": 1.3, "dataset": "", "family": "opt" }, "opt-6.7b": { "size": 6.7, "dataset": "", "family": "opt" }, "pythia-1.4": { "size": 1.4, "dataset": "", "family": "pythia" }, "pythia-1.4-dedup": { "size": 1.4, "dataset": "", "family": "pythia" }, "pythia-6.9": { "size": 6.9, "dataset": "", "family": "pythia" }, "pythia-6.9-dedup": { "size": 6.9, "dataset": "", "family": "pythia" } }: False
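The evaluation returns False because the extracted JSON sets `"dataset": "INCITE"` for the two RedPajama models, while the target JSON expects an empty string there. A minimal sketch of how such a subset evaluator could work (the function name `is_subset` and the exact semantics are assumptions, not taken from the actual evaluator):

```python
def is_subset(expected, actual):
    """Return True if every key/value in `expected` appears in `actual`
    with an equal (recursively matching) value."""
    if isinstance(expected, dict):
        return (isinstance(actual, dict) and
                all(k in actual and is_subset(v, actual[k])
                    for k, v in expected.items()))
    return expected == actual

# The mismatch that makes this run fail, isolated to one entry:
expected = {"RedPajama-INCITE-7B-Base":
            {"size": 7, "dataset": "", "family": "RedPajama"}}
actual = {"RedPajama-INCITE-7B-Base":
          {"size": 7, "dataset": "INCITE", "family": "RedPajama"}}
print(is_subset(expected, actual))  # False: "" != "INCITE"
```

Under this reading, the model's answer would pass if it left the `dataset` field empty for every model, as the prompt's example implies.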

Initial Query

Create a JSON metadata for these models: Mistral-7B-v0.1 RedPajama-INCITE-7B-Base RedPajama-INCITE-Base-3B-v1 falcon40b falcon7b gpt2-xl llama-65b llama-7b neo-1.3 neo-2.7 neo-6 open_llama_3b_v2 open_llama_7b_v2 opt-1.3b opt-6.7b pythia-1.4 pythia-1.4-dedup pythia-6.9 pythia-6.9-dedup With the format: {"Mistral-7B-v0.1": {"size": 7, dataset: "", "family": "Mistral"}, ...} where family is one of base = [ 'pythia', 'llama', 'Mistral', 'gpt2', 'opt', 'RedPajama', 'neo', 'open_llama', 'falcon' ] gpt2-xl is 1.5b parameters.

LLM Generation

Query

Create a JSON metadata for these models: Mistral-7B-v0.1 RedPajama-INCITE-7B-Base RedPajama-INCITE-Base-3B-v1 falcon40b falcon7b gpt2-xl llama-65b llama-7b neo-1.3 neo-2.7 neo-6 open_llama_3b_v2 open_llama_7b_v2 opt-1.3b opt-6.7b pythia-1.4 pythia-1.4-dedup pythia-6.9 pythia-6.9-dedup With the format: {"Mistral-7B-v0.1": {"size": 7, dataset: "", "family": "Mistral"}, ...} where family is one of base = [ 'pythia', 'llama', 'Mistral', 'gpt2', 'opt', 'RedPajama', 'neo', 'open_llama', 'falcon' ] gpt2-xl is 1.5b parameters.

Output

Here is the JSON metadata for the provided models:

{
  "Mistral-7B-v0.1": {
    "size": 7,
    "dataset": "",
    "family": "Mistral"
  },
  "RedPajama-INCITE-7B-Base": {
    "size": 7,
    "dataset": "INCITE",
    "family": "RedPajama"
  },
  "RedPajama-INCITE-Base-3B-v1": {
    "size": 3,
    "dataset": "INCITE",
    "family": "RedPajama"
  },
  "falcon40b": {
    "size": 40,
    "dataset": "",
    "family": "falcon"
  },
  "falcon7b": {
    "size": 7,
    "dataset": "",
    "family": "falcon"
  },
  "gpt2-xl": {
    "size": 1.5,
    "dataset": "",
    "family": "gpt2"
  },
  "llama-65b": {
    "size": 65,
    "dataset": "",
    "family": "llama"
  },
  "llama-7b": {
    "size": 7,
    "dataset": "",
    "family": "llama"
  },
  "neo-1.3": {
    "size": 1.3,
    "dataset": "",
    "family": "neo"
  },
  "neo-2.7": {
    "size": 2.7,
    "dataset": "",
    "family": "neo"
  },
  "neo-6": {
    "size": 6,
    "dataset": "",
    "family": "neo"
  },
  "open_llama_3b_v2": {
    "size": 3,
    "dataset": "",
    "family": "open_llama"
  },
  "open_llama_7b_v2": {
    "size": 7,
    "dataset": "",
    "family": "open_llama"
  },
  "opt-1.3b": {
    "size": 1.3,
    "dataset": "",
    "family": "opt"
  },
  "opt-6.7b": {
    "size": 6.7,
    "dataset": "",
    "family": "opt"
  },
  "pythia-1.4": {
    "size": 1.4,
    "dataset": "",
    "family": "pythia"
  },
  "pythia-1.4-dedup": {
    "size": 1.4,
    "dataset": "",
    "family": "pythia"
  },
  "pythia-6.9": {
    "size": 6.9,
    "dataset": "",
    "family": "pythia"
  },
  "pythia-6.9-dedup": {
    "size": 6.9,
    "dataset": "",
    "family": "pythia"
  }
}
Note: The dataset field is left empty as no specific dataset information was provided for each model. Confidence: 99%

Extract Json

I extracted the following JSON from that output:

{
  "Mistral-7B-v0.1": {
    "size": 7,
    "dataset": "",
    "family": "Mistral"
  },
  "RedPajama-INCITE-7B-Base": {
    "size": 7,
    "dataset": "INCITE",
    "family": "RedPajama"
  },
  "RedPajama-INCITE-Base-3B-v1": {
    "size": 3,
    "dataset": "INCITE",
    "family": "RedPajama"
  },
  "falcon40b": {
    "size": 40,
    "dataset": "",
    "family": "falcon"
  },
  "falcon7b": {
    "size": 7,
    "dataset": "",
    "family": "falcon"
  },
  "gpt2-xl": {
    "size": 1.5,
    "dataset": "",
    "family": "gpt2"
  },
  "llama-65b": {
    "size": 65,
    "dataset": "",
    "family": "llama"
  },
  "llama-7b": {
    "size": 7,
    "dataset": "",
    "family": "llama"
  },
  "neo-1.3": {
    "size": 1.3,
    "dataset": "",
    "family": "neo"
  },
  "neo-2.7": {
    "size": 2.7,
    "dataset": "",
    "family": "neo"
  },
  "neo-6": {
    "size": 6,
    "dataset": "",
    "family": "neo"
  },
  "open_llama_3b_v2": {
    "size": 3,
    "dataset": "",
    "family": "open_llama"
  },
  "open_llama_7b_v2": {
    "size": 7,
    "dataset": "",
    "family": "open_llama"
  },
  "opt-1.3b": {
    "size": 1.3,
    "dataset": "",
    "family": "opt"
  },
  "opt-6.7b": {
    "size": 6.7,
    "dataset": "",
    "family": "opt"
  },
  "pythia-1.4": {
    "size": 1.4,
    "dataset": "",
    "family": "pythia"
  },
  "pythia-1.4-dedup": {
    "size": 1.4,
    "dataset": "",
    "family": "pythia"
  },
  "pythia-6.9": {
    "size": 6.9,
    "dataset": "",
    "family": "pythia"
  },
  "pythia-6.9-dedup": {
    "size": 6.9,
    "dataset": "",
    "family": "pythia"
  }
}

JSON Subset Evaluator

Testing if the previous output matches the JSON: { "Mistral-7B-v0.1": { "size": 7, "dataset": "", "family": "Mistral" }, "RedPajama-INCITE-7B-Base": { "size": 7, "dataset": "", "family": "RedPajama" }, "RedPajama-INCITE-Base-3B-v1": { "size": 3, "dataset": "", "family": "RedPajama" }, "falcon40b": { "size": 40, "dataset": "", "family": "falcon" }, "falcon7b": { "size": 7, "dataset": "", "family": "falcon" }, "gpt2-xl": { "size": 1.5, "dataset": "", "family": "gpt2" }, "llama-65b": { "size": 65, "dataset": "", "family": "llama" }, "llama-7b": { "size": 7, "dataset": "", "family": "llama" }, "neo-1.3": { "size": 1.3, "dataset": "", "family": "neo" }, "neo-2.7": { "size": 2.7, "dataset": "", "family": "neo" }, "neo-6": { "size": 6, "dataset": "", "family": "neo" }, "open_llama_3b_v2": { "size": 3, "dataset": "", "family": "open_llama" }, "open_llama_7b_v2": { "size": 7, "dataset": "", "family": "open_llama" }, "opt-1.3b": { "size": 1.3, "dataset": "", "family": "opt" }, "opt-6.7b": { "size": 6.7, "dataset": "", "family": "opt" }, "pythia-1.4": { "size": 1.4, "dataset": "", "family": "pythia" }, "pythia-1.4-dedup": { "size": 1.4, "dataset": "", "family": "pythia" }, "pythia-6.9": { "size": 6.9, "dataset": "", "family": "pythia" }, "pythia-6.9-dedup": { "size": 6.9, "dataset": "", "family": "pythia" } }: False

Initial Query

Create a JSON metadata for these models: Mistral-7B-v0.1 RedPajama-INCITE-7B-Base RedPajama-INCITE-Base-3B-v1 falcon40b falcon7b gpt2-xl llama-65b llama-7b neo-1.3 neo-2.7 neo-6 open_llama_3b_v2 open_llama_7b_v2 opt-1.3b opt-6.7b pythia-1.4 pythia-1.4-dedup pythia-6.9 pythia-6.9-dedup With the format: {"Mistral-7B-v0.1": {"size": 7, dataset: "", "family": "Mistral"}, ...} where family is one of base = [ 'pythia', 'llama', 'Mistral', 'gpt2', 'opt', 'RedPajama', 'neo', 'open_llama', 'falcon' ] gpt2-xl is 1.5b parameters.

LLM Generation

Query

Create a JSON metadata for these models: Mistral-7B-v0.1 RedPajama-INCITE-7B-Base RedPajama-INCITE-Base-3B-v1 falcon40b falcon7b gpt2-xl llama-65b llama-7b neo-1.3 neo-2.7 neo-6 open_llama_3b_v2 open_llama_7b_v2 opt-1.3b opt-6.7b pythia-1.4 pythia-1.4-dedup pythia-6.9 pythia-6.9-dedup With the format: {"Mistral-7B-v0.1": {"size": 7, dataset: "", "family": "Mistral"}, ...} where family is one of base = [ 'pythia', 'llama', 'Mistral', 'gpt2', 'opt', 'RedPajama', 'neo', 'open_llama', 'falcon' ] gpt2-xl is 1.5b parameters.

Output

Here is the JSON metadata for the given models:

{
  "Mistral-7B-v0.1": {
    "size": 7,
    "dataset": "",
    "family": "Mistral"
  },
  "RedPajama-INCITE-7B-Base": {
    "size": 7,
    "dataset": "INCITE",
    "family": "RedPajama"
  },
  "RedPajama-INCITE-Base-3B-v1": {
    "size": 3,
    "dataset": "INCITE",
    "family": "RedPajama"
  },
  "falcon40b": {
    "size": 40,
    "dataset": "",
    "family": "falcon"
  },
  "falcon7b": {
    "size": 7,
    "dataset": "",
    "family": "falcon"
  },
  "gpt2-xl": {
    "size": 1.5,
    "dataset": "",
    "family": "gpt2"
  },
  "llama-65b": {
    "size": 65,
    "dataset": "",
    "family": "llama"
  },
  "llama-7b": {
    "size": 7,
    "dataset": "",
    "family": "llama"
  },
  "neo-1.3": {
    "size": 1.3,
    "dataset": "",
    "family": "neo"
  },
  "neo-2.7": {
    "size": 2.7,
    "dataset": "",
    "family": "neo"
  },
  "neo-6": {
    "size": 6,
    "dataset": "",
    "family": "neo"
  },
  "open_llama_3b_v2": {
    "size": 3,
    "dataset": "",
    "family": "open_llama"
  },
  "open_llama_7b_v2": {
    "size": 7,
    "dataset": "",
    "family": "open_llama"
  },
  "opt-1.3b": {
    "size": 1.3,
    "dataset": "",
    "family": "opt"
  },
  "opt-6.7b": {
    "size": 6.7,
    "dataset": "",
    "family": "opt"
  },
  "pythia-1.4": {
    "size": 1.4,
    "dataset": "",
    "family": "pythia"
  },
  "pythia-1.4-dedup": {
    "size": 1.4,
    "dataset": "",
    "family": "pythia"
  },
  "pythia-6.9": {
    "size": 6.9,
    "dataset": "",
    "family": "pythia"
  },
  "pythia-6.9-dedup": {
    "size": 6.9,
    "dataset": "",
    "family": "pythia"
  }
}
Note that the dataset field is left empty for most models, as the provided names do not specify the dataset used for training. You can update it if you have more information on the datasets used for each model.

Extract Json

I extracted the following JSON from that output:

{
  "Mistral-7B-v0.1": {
    "size": 7,
    "dataset": "",
    "family": "Mistral"
  },
  "RedPajama-INCITE-7B-Base": {
    "size": 7,
    "dataset": "INCITE",
    "family": "RedPajama"
  },
  "RedPajama-INCITE-Base-3B-v1": {
    "size": 3,
    "dataset": "INCITE",
    "family": "RedPajama"
  },
  "falcon40b": {
    "size": 40,
    "dataset": "",
    "family": "falcon"
  },
  "falcon7b": {
    "size": 7,
    "dataset": "",
    "family": "falcon"
  },
  "gpt2-xl": {
    "size": 1.5,
    "dataset": "",
    "family": "gpt2"
  },
  "llama-65b": {
    "size": 65,
    "dataset": "",
    "family": "llama"
  },
  "llama-7b": {
    "size": 7,
    "dataset": "",
    "family": "llama"
  },
  "neo-1.3": {
    "size": 1.3,
    "dataset": "",
    "family": "neo"
  },
  "neo-2.7": {
    "size": 2.7,
    "dataset": "",
    "family": "neo"
  },
  "neo-6": {
    "size": 6,
    "dataset": "",
    "family": "neo"
  },
  "open_llama_3b_v2": {
    "size": 3,
    "dataset": "",
    "family": "open_llama"
  },
  "open_llama_7b_v2": {
    "size": 7,
    "dataset": "",
    "family": "open_llama"
  },
  "opt-1.3b": {
    "size": 1.3,
    "dataset": "",
    "family": "opt"
  },
  "opt-6.7b": {
    "size": 6.7,
    "dataset": "",
    "family": "opt"
  },
  "pythia-1.4": {
    "size": 1.4,
    "dataset": "",
    "family": "pythia"
  },
  "pythia-1.4-dedup": {
    "size": 1.4,
    "dataset": "",
    "family": "pythia"
  },
  "pythia-6.9": {
    "size": 6.9,
    "dataset": "",
    "family": "pythia"
  },
  "pythia-6.9-dedup": {
    "size": 6.9,
    "dataset": "",
    "family": "pythia"
  }
}

JSON Subset Evaluator

Testing if the previous output matches the JSON: { "Mistral-7B-v0.1": { "size": 7, "dataset": "", "family": "Mistral" }, "RedPajama-INCITE-7B-Base": { "size": 7, "dataset": "", "family": "RedPajama" }, "RedPajama-INCITE-Base-3B-v1": { "size": 3, "dataset": "", "family": "RedPajama" }, "falcon40b": { "size": 40, "dataset": "", "family": "falcon" }, "falcon7b": { "size": 7, "dataset": "", "family": "falcon" }, "gpt2-xl": { "size": 1.5, "dataset": "", "family": "gpt2" }, "llama-65b": { "size": 65, "dataset": "", "family": "llama" }, "llama-7b": { "size": 7, "dataset": "", "family": "llama" }, "neo-1.3": { "size": 1.3, "dataset": "", "family": "neo" }, "neo-2.7": { "size": 2.7, "dataset": "", "family": "neo" }, "neo-6": { "size": 6, "dataset": "", "family": "neo" }, "open_llama_3b_v2": { "size": 3, "dataset": "", "family": "open_llama" }, "open_llama_7b_v2": { "size": 7, "dataset": "", "family": "open_llama" }, "opt-1.3b": { "size": 1.3, "dataset": "", "family": "opt" }, "opt-6.7b": { "size": 6.7, "dataset": "", "family": "opt" }, "pythia-1.4": { "size": 1.4, "dataset": "", "family": "pythia" }, "pythia-1.4-dedup": { "size": 1.4, "dataset": "", "family": "pythia" }, "pythia-6.9": { "size": 6.9, "dataset": "", "family": "pythia" }, "pythia-6.9-dedup": { "size": 6.9, "dataset": "", "family": "pythia" } }: False

Initial Query

Create a JSON metadata for these models: Mistral-7B-v0.1 RedPajama-INCITE-7B-Base RedPajama-INCITE-Base-3B-v1 falcon40b falcon7b gpt2-xl llama-65b llama-7b neo-1.3 neo-2.7 neo-6 open_llama_3b_v2 open_llama_7b_v2 opt-1.3b opt-6.7b pythia-1.4 pythia-1.4-dedup pythia-6.9 pythia-6.9-dedup With the format: {"Mistral-7B-v0.1": {"size": 7, dataset: "", "family": "Mistral"}, ...} where family is one of base = [ 'pythia', 'llama', 'Mistral', 'gpt2', 'opt', 'RedPajama', 'neo', 'open_llama', 'falcon' ] gpt2-xl is 1.5b parameters.

LLM Generation

Query

Create a JSON metadata for these models: Mistral-7B-v0.1 RedPajama-INCITE-7B-Base RedPajama-INCITE-Base-3B-v1 falcon40b falcon7b gpt2-xl llama-65b llama-7b neo-1.3 neo-2.7 neo-6 open_llama_3b_v2 open_llama_7b_v2 opt-1.3b opt-6.7b pythia-1.4 pythia-1.4-dedup pythia-6.9 pythia-6.9-dedup With the format: {"Mistral-7B-v0.1": {"size": 7, dataset: "", "family": "Mistral"}, ...} where family is one of base = [ 'pythia', 'llama', 'Mistral', 'gpt2', 'opt', 'RedPajama', 'neo', 'open_llama', 'falcon' ] gpt2-xl is 1.5b parameters.

Output

Here is the JSON metadata for the models you provided:

{
  "Mistral-7B-v0.1": {
    "size": 7,
    "dataset": "",
    "family": "Mistral"
  },
  "RedPajama-INCITE-7B-Base": {
    "size": 7,
    "dataset": "INCITE",
    "family": "RedPajama"
  },
  "RedPajama-INCITE-Base-3B-v1": {
    "size": 3,
    "dataset": "INCITE",
    "family": "RedPajama"
  },
  "falcon40b": {
    "size": 40,
    "dataset": "",
    "family": "falcon"
  },
  "falcon7b": {
    "size": 7,
    "dataset": "",
    "family": "falcon"
  },
  "gpt2-xl": {
    "size": 1.5,
    "dataset": "",
    "family": "gpt2"
  },
  "llama-65b": {
    "size": 65,
    "dataset": "",
    "family": "llama"
  },
  "llama-7b": {
    "size": 7,
    "dataset": "",
    "family": "llama"
  },
  "neo-1.3": {
    "size": 1.3,
    "dataset": "",
    "family": "neo"
  },
  "neo-2.7": {
    "size": 2.7,
    "dataset": "",
    "family": "neo"
  },
  "neo-6": {
    "size": 6,
    "dataset": "",
    "family": "neo"
  },
  "open_llama_3b_v2": {
    "size": 3,
    "dataset": "",
    "family": "open_llama"
  },
  "open_llama_7b_v2": {
    "size": 7,
    "dataset": "",
    "family": "open_llama"
  },
  "opt-1.3b": {
    "size": 1.3,
    "dataset": "",
    "family": "opt"
  },
  "opt-6.7b": {
    "size": 6.7,
    "dataset": "",
    "family": "opt"
  },
  "pythia-1.4": {
    "size": 1.4,
    "dataset": "",
    "family": "pythia"
  },
  "pythia-1.4-dedup": {
    "size": 1.4,
    "dataset": "",
    "family": "pythia"
  },
  "pythia-6.9": {
    "size": 6.9,
    "dataset": "",
    "family": "pythia"
  },
  "pythia-6.9-dedup": {
    "size": 6.9,
    "dataset": "",
    "family": "pythia"
  }
}
Note that I have assumed that the dataset for most models is not specified, and that the size for `gpt2-xl` is 1.5 billion parameters, as you mentioned.

Extract Json

I extracted the following JSON from that output:

{
  "Mistral-7B-v0.1": {
    "size": 7,
    "dataset": "",
    "family": "Mistral"
  },
  "RedPajama-INCITE-7B-Base": {
    "size": 7,
    "dataset": "INCITE",
    "family": "RedPajama"
  },
  "RedPajama-INCITE-Base-3B-v1": {
    "size": 3,
    "dataset": "INCITE",
    "family": "RedPajama"
  },
  "falcon40b": {
    "size": 40,
    "dataset": "",
    "family": "falcon"
  },
  "falcon7b": {
    "size": 7,
    "dataset": "",
    "family": "falcon"
  },
  "gpt2-xl": {
    "size": 1.5,
    "dataset": "",
    "family": "gpt2"
  },
  "llama-65b": {
    "size": 65,
    "dataset": "",
    "family": "llama"
  },
  "llama-7b": {
    "size": 7,
    "dataset": "",
    "family": "llama"
  },
  "neo-1.3": {
    "size": 1.3,
    "dataset": "",
    "family": "neo"
  },
  "neo-2.7": {
    "size": 2.7,
    "dataset": "",
    "family": "neo"
  },
  "neo-6": {
    "size": 6,
    "dataset": "",
    "family": "neo"
  },
  "open_llama_3b_v2": {
    "size": 3,
    "dataset": "",
    "family": "open_llama"
  },
  "open_llama_7b_v2": {
    "size": 7,
    "dataset": "",
    "family": "open_llama"
  },
  "opt-1.3b": {
    "size": 1.3,
    "dataset": "",
    "family": "opt"
  },
  "opt-6.7b": {
    "size": 6.7,
    "dataset": "",
    "family": "opt"
  },
  "pythia-1.4": {
    "size": 1.4,
    "dataset": "",
    "family": "pythia"
  },
  "pythia-1.4-dedup": {
    "size": 1.4,
    "dataset": "",
    "family": "pythia"
  },
  "pythia-6.9": {
    "size": 6.9,
    "dataset": "",
    "family": "pythia"
  },
  "pythia-6.9-dedup": {
    "size": 6.9,
    "dataset": "",
    "family": "pythia"
  }
}

JSON Subset Evaluator

Testing if the previous output matches the JSON: { "Mistral-7B-v0.1": { "size": 7, "dataset": "", "family": "Mistral" }, "RedPajama-INCITE-7B-Base": { "size": 7, "dataset": "", "family": "RedPajama" }, "RedPajama-INCITE-Base-3B-v1": { "size": 3, "dataset": "", "family": "RedPajama" }, "falcon40b": { "size": 40, "dataset": "", "family": "falcon" }, "falcon7b": { "size": 7, "dataset": "", "family": "falcon" }, "gpt2-xl": { "size": 1.5, "dataset": "", "family": "gpt2" }, "llama-65b": { "size": 65, "dataset": "", "family": "llama" }, "llama-7b": { "size": 7, "dataset": "", "family": "llama" }, "neo-1.3": { "size": 1.3, "dataset": "", "family": "neo" }, "neo-2.7": { "size": 2.7, "dataset": "", "family": "neo" }, "neo-6": { "size": 6, "dataset": "", "family": "neo" }, "open_llama_3b_v2": { "size": 3, "dataset": "", "family": "open_llama" }, "open_llama_7b_v2": { "size": 7, "dataset": "", "family": "open_llama" }, "opt-1.3b": { "size": 1.3, "dataset": "", "family": "opt" }, "opt-6.7b": { "size": 6.7, "dataset": "", "family": "opt" }, "pythia-1.4": { "size": 1.4, "dataset": "", "family": "pythia" }, "pythia-1.4-dedup": { "size": 1.4, "dataset": "", "family": "pythia" }, "pythia-6.9": { "size": 6.9, "dataset": "", "family": "pythia" }, "pythia-6.9-dedup": { "size": 6.9, "dataset": "", "family": "pythia" } }: False

Initial Query

Create a JSON metadata for these models: Mistral-7B-v0.1 RedPajama-INCITE-7B-Base RedPajama-INCITE-Base-3B-v1 falcon40b falcon7b gpt2-xl llama-65b llama-7b neo-1.3 neo-2.7 neo-6 open_llama_3b_v2 open_llama_7b_v2 opt-1.3b opt-6.7b pythia-1.4 pythia-1.4-dedup pythia-6.9 pythia-6.9-dedup With the format: {"Mistral-7B-v0.1": {"size": 7, dataset: "", "family": "Mistral"}, ...} where family is one of base = [ 'pythia', 'llama', 'Mistral', 'gpt2', 'opt', 'RedPajama', 'neo', 'open_llama', 'falcon' ] gpt2-xl is 1.5b parameters.

LLM Generation

Query

Create a JSON metadata for these models: Mistral-7B-v0.1 RedPajama-INCITE-7B-Base RedPajama-INCITE-Base-3B-v1 falcon40b falcon7b gpt2-xl llama-65b llama-7b neo-1.3 neo-2.7 neo-6 open_llama_3b_v2 open_llama_7b_v2 opt-1.3b opt-6.7b pythia-1.4 pythia-1.4-dedup pythia-6.9 pythia-6.9-dedup With the format: {"Mistral-7B-v0.1": {"size": 7, dataset: "", "family": "Mistral"}, ...} where family is one of base = [ 'pythia', 'llama', 'Mistral', 'gpt2', 'opt', 'RedPajama', 'neo', 'open_llama', 'falcon' ] gpt2-xl is 1.5b parameters.

Output

Here is the JSON metadata for the given models:

{
  "Mistral-7B-v0.1": {
    "size": 7,
    "dataset": "",
    "family": "Mistral"
  },
  "RedPajama-INCITE-7B-Base": {
    "size": 7,
    "dataset": "INCITE",
    "family": "RedPajama"
  },
  "RedPajama-INCITE-Base-3B-v1": {
    "size": 3,
    "dataset": "INCITE",
    "family": "RedPajama"
  },
  "falcon40b": {
    "size": 40,
    "dataset": "",
    "family": "falcon"
  },
  "falcon7b": {
    "size": 7,
    "dataset": "",
    "family": "falcon"
  },
  "gpt2-xl": {
    "size": 1.5,
    "dataset": "",
    "family": "gpt2"
  },
  "llama-65b": {
    "size": 65,
    "dataset": "",
    "family": "llama"
  },
  "llama-7b": {
    "size": 7,
    "dataset": "",
    "family": "llama"
  },
  "neo-1.3": {
    "size": 1.3,
    "dataset": "",
    "family": "neo"
  },
  "neo-2.7": {
    "size": 2.7,
    "dataset": "",
    "family": "neo"
  },
  "neo-6": {
    "size": 6,
    "dataset": "",
    "family": "neo"
  },
  "open_llama_3b_v2": {
    "size": 3,
    "dataset": "",
    "family": "open_llama"
  },
  "open_llama_7b_v2": {
    "size": 7,
    "dataset": "",
    "family": "open_llama"
  },
  "opt-1.3b": {
    "size": 1.3,
    "dataset": "",
    "family": "opt"
  },
  "opt-6.7b": {
    "size": 6.7,
    "dataset": "",
    "family": "opt"
  },
  "pythia-1.4": {
    "size": 1.4,
    "dataset": "",
    "family": "pythia"
  },
  "pythia-1.4-dedup": {
    "size": 1.4,
    "dataset": "",
    "family": "pythia"
  },
  "pythia-6.9": {
    "size": 6.9,
    "dataset": "",
    "family": "pythia"
  },
  "pythia-6.9-dedup": {
    "size": 6.9,
    "dataset": "",
    "family": "pythia"
  }
}
In this JSON, the size field is in billions of parameters; per your instruction, `gpt2-xl` is listed as 1.5. Note that the dataset field is left empty for most models since you didn't provide any information on the specific datasets used for training. If you have dataset information for each model, you can fill in the dataset field accordingly.

Extract Json

I extracted the following JSON from that output:

{
  "Mistral-7B-v0.1": {
    "size": 7,
    "dataset": "",
    "family": "Mistral"
  },
  "RedPajama-INCITE-7B-Base": {
    "size": 7,
    "dataset": "INCITE",
    "family": "RedPajama"
  },
  "RedPajama-INCITE-Base-3B-v1": {
    "size": 3,
    "dataset": "INCITE",
    "family": "RedPajama"
  },
  "falcon40b": {
    "size": 40,
    "dataset": "",
    "family": "falcon"
  },
  "falcon7b": {
    "size": 7,
    "dataset": "",
    "family": "falcon"
  },
  "gpt2-xl": {
    "size": 1.5,
    "dataset": "",
    "family": "gpt2"
  },
  "llama-65b": {
    "size": 65,
    "dataset": "",
    "family": "llama"
  },
  "llama-7b": {
    "size": 7,
    "dataset": "",
    "family": "llama"
  },
  "neo-1.3": {
    "size": 1.3,
    "dataset": "",
    "family": "neo"
  },
  "neo-2.7": {
    "size": 2.7,
    "dataset": "",
    "family": "neo"
  },
  "neo-6": {
    "size": 6,
    "dataset": "",
    "family": "neo"
  },
  "open_llama_3b_v2": {
    "size": 3,
    "dataset": "",
    "family": "open_llama"
  },
  "open_llama_7b_v2": {
    "size": 7,
    "dataset": "",
    "family": "open_llama"
  },
  "opt-1.3b": {
    "size": 1.3,
    "dataset": "",
    "family": "opt"
  },
  "opt-6.7b": {
    "size": 6.7,
    "dataset": "",
    "family": "opt"
  },
  "pythia-1.4": {
    "size": 1.4,
    "dataset": "",
    "family": "pythia"
  },
  "pythia-1.4-dedup": {
    "size": 1.4,
    "dataset": "",
    "family": "pythia"
  },
  "pythia-6.9": {
    "size": 6.9,
    "dataset": "",
    "family": "pythia"
  },
  "pythia-6.9-dedup": {
    "size": 6.9,
    "dataset": "",
    "family": "pythia"
  }
}

JSON Subset Evaluator

Testing if the previous output matches the JSON: { "Mistral-7B-v0.1": { "size": 7, "dataset": "", "family": "Mistral" }, "RedPajama-INCITE-7B-Base": { "size": 7, "dataset": "", "family": "RedPajama" }, "RedPajama-INCITE-Base-3B-v1": { "size": 3, "dataset": "", "family": "RedPajama" }, "falcon40b": { "size": 40, "dataset": "", "family": "falcon" }, "falcon7b": { "size": 7, "dataset": "", "family": "falcon" }, "gpt2-xl": { "size": 1.5, "dataset": "", "family": "gpt2" }, "llama-65b": { "size": 65, "dataset": "", "family": "llama" }, "llama-7b": { "size": 7, "dataset": "", "family": "llama" }, "neo-1.3": { "size": 1.3, "dataset": "", "family": "neo" }, "neo-2.7": { "size": 2.7, "dataset": "", "family": "neo" }, "neo-6": { "size": 6, "dataset": "", "family": "neo" }, "open_llama_3b_v2": { "size": 3, "dataset": "", "family": "open_llama" }, "open_llama_7b_v2": { "size": 7, "dataset": "", "family": "open_llama" }, "opt-1.3b": { "size": 1.3, "dataset": "", "family": "opt" }, "opt-6.7b": { "size": 6.7, "dataset": "", "family": "opt" }, "pythia-1.4": { "size": 1.4, "dataset": "", "family": "pythia" }, "pythia-1.4-dedup": { "size": 1.4, "dataset": "", "family": "pythia" }, "pythia-6.9": { "size": 6.9, "dataset": "", "family": "pythia" }, "pythia-6.9-dedup": { "size": 6.9, "dataset": "", "family": "pythia" } }: False