
Change of Link to Substation Dataset #2738


Open
rijuld opened this issue Apr 17, 2025 · 2 comments
Labels
datasets Geospatial or benchmark datasets

Comments

@rijuld
Contributor

rijuld commented Apr 17, 2025

Hi @adamjstewart, I want to change the link to the substation dataset to Hugging Face for the next release, but I am getting the error below while doing so. This is because the dataset is big and I have to divide it into parts. Can I implement a custom function and use an external library to handle this, or is there an alternative?

File /ext3/miniforge3/lib/python3.12/zipfile/__init__.py:257, in _EndRecData64(fpin, offset, endrec)
254 return endrec
256 if diskno != 0 or disks > 1:
--> 257 raise BadZipFile("zipfiles that span multiple disks are not supported")
259 # Assume no 'zip64 extensible data'
260 fpin.seek(offset - sizeEndCentDir64Locator - sizeEndCentDir64, 2)

BadZipFile: zipfiles that span multiple disks are not supported

@adamjstewart
Collaborator

What's the total size of the .tar.gz? HF can host individual files up to 50 GB each. A larger archive can be split into parts, but you'll need to merge the parts again before extraction. The SSL4EO-L dataset is a good example of this.

@adamjstewart adamjstewart added the datasets Geospatial or benchmark datasets label Apr 17, 2025
@rijuld
Contributor Author

rijuld commented Apr 17, 2025

It's around 70 GB. Okay, I will use SSL4EO-L as an example.
