DataProvenanceInitiative/common_pile_set • Viewer • 4.79M • 44
DataProvenanceInitiative/Megawika_corrected • Viewer • 556k • 200
DataProvenanceInitiative/stack-exchange-instruction-2split • Viewer • 10.8M • 67
DataProvenanceInitiative/Megawika_subset • 76
DataProvenanceInitiative/common_pile_ultra_permissive • Viewer • 7.05M • 23
DataProvenanceInitiative/Commercial_or_unspecified_licenses_and_terms • Viewer • 61M • 75
DataProvenanceInitiative/commercial_or_unspecified_licenses • Viewer • 74.6M • 84
DataProvenanceInitiative/commercial_licenses_and_terms • Viewer • 25.2M • 82
DataProvenanceInitiative/commercial_licenses • Viewer • 35M • 107 • 2
DataProvenanceInitiative/Everything • Viewer • 44.5M • 236 • 1
DataProvenanceInitiative/DPI-Dolma • Viewer • 9.69M • 2 • 1
DataProvenanceInitiative/Ultra_Permissive_Test • Preview • 13
DataProvenanceInitiative/common_pile_subset • Preview • 1
DataProvenanceInitiative/seacrowd • Viewer • 332k • 2
DataProvenanceInitiative/Commercially-Verified-Licenses • Preview • 60 • 5
DataProvenanceInitiative/t0_submix_original • Viewer • 1.65M • 2 • 2
DataProvenanceInitiative/dialog_submix_original • Viewer • 554k • 2
DataProvenanceInitiative/niv2_submix_original • Viewer • 10.1M • 1 • 3
DataProvenanceInitiative/cot_submix_original • Viewer • 184k • 18 • 3
DataProvenanceInitiative/flan2021_submix_original • Viewer • 5.36M • 6
DataProvenanceInitiative/Commercial-Flan-Collection-SNI • Viewer • 1.31M • 4
DataProvenanceInitiative/Commercial-Flan-Collection-P3 • Preview • 1
DataProvenanceInitiative/Commercial-Flan-Collection-Flan-2021 • Viewer • 445k • 3 • 1
DataProvenanceInitiative/Commercial-Flan-Collection-Dialog • Viewer • 554k • 10
DataProvenanceInitiative/Commercial-Flan-Collection-Chain-Of-Thought • Viewer • 184k • 9 • 5