Upload dataset from Kaggle
Upload any dataset - Images, Videos, Audios or Text
Make sure that you are using the SSH/Terminal from NimbleBox to run the following commands.
How to read this document effectively?
This is an extensive document, but the entire process takes only 2-5 minutes to carry out. Skip the steps you have already done. Use the troubleshooting notes in case you encounter errors. See the GIFs for UI clarity. Copy and paste the commands.

Connect with Kaggle

Generate the API token:
    1. Go to My Account.
    2. Go to the API section.
    3. Click “Create New API Token”.
    4. A kaggle.json file will be downloaded, containing your username and token key.
    5. Upload kaggle.json to NimbleBox.
Note: Make sure you click 'Expire API token' before creating a new API token.
https://miro.medium.com/max/1400/1*dUdpTZgjxc45oYgEjISESA.png
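Before relying on the uploaded file, it can help to confirm that it contains both fields mentioned above. The following is a minimal sketch, assuming the file is a small JSON object with "username" and "key" entries; `check_kaggle_token` is a hypothetical helper name and is not part of the Kaggle CLI.

```shell
# Hypothetical helper: returns 0 if the file mentions both expected fields.
# This is a plain text check with grep, not a full JSON validation.
check_kaggle_token() {
  token_file="$1"
  if [ ! -f "$token_file" ]; then
    echo "No such file: $token_file" >&2
    return 1
  fi
  grep -q '"username"' "$token_file" && grep -q '"key"' "$token_file"
}
```

Usage, from the directory that holds the uploaded file: check_kaggle_token kaggle.json && echo "token file looks complete"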

Installing Kaggle

You may need to install kaggle in NimbleBox before downloading datasets.
Run the following commands in a terminal (the dedicated Terminal, or the terminal in VSCode or Jupyter).
Make sure you have activated the CUDA environment.
conda activate cuda<version>
Example: conda activate cuda101
Use mamba to install the kaggle package:
mamba install -c conda-forge kaggle
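To confirm that the installation worked in the current terminal, you can check whether the kaggle command is on your PATH. This is a small sketch; `have_command` and `check_kaggle_installed` are hypothetical helper names, and the hint text is only a suggestion.

```shell
# Hypothetical helper: reports whether a command is available on PATH.
have_command() {
  command -v "$1" >/dev/null 2>&1
}

# Prints where kaggle was found, or a hint and a non-zero status if it is missing.
check_kaggle_installed() {
  if have_command kaggle; then
    echo "kaggle found at: $(command -v kaggle)"
  else
    echo "kaggle not found; activate the CUDA environment and install it first" >&2
    return 1
  fi
}
```

Usage: check_kaggle_installed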

Download datasets using Kaggle’s public API

Step 1: Create a directory named .kaggle in your home directory.
mkdir -p ~/.kaggle
🔧Troubleshooting: You might encounter a 'kaggle command not found' error when running commands from a Jupyter notebook, or when kaggle is not installed in the first place. In the second case, install it as described under "Installing Kaggle".
Step 2: Copy the kaggle.json file into the ~/.kaggle directory.
cp <source-file> <destination-file>
Example: cp /mnt/disks/user/project/kaggle.json ~/.kaggle
Double check: Make sure the source path matches where kaggle.json is actually stored inside NimbleBox.
Step 3: Restrict the file's permissions so that only your user can read and write it.
chmod 600 ~/.kaggle/kaggle.json
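Steps 1 to 3 can be combined into one small function. This is a sketch under stated assumptions: `install_kaggle_token` is a hypothetical helper name, and the optional second argument (the target directory, defaulting to ~/.kaggle) is added here only for illustration.

```shell
# Hypothetical helper combining Steps 1 to 3.
# $1: path to the uploaded kaggle.json
# $2: target directory (assumption: defaults to ~/.kaggle, as in the steps above)
install_kaggle_token() {
  src="$1"
  dest_dir="${2:-$HOME/.kaggle}"
  if [ ! -f "$src" ]; then
    echo "Token file not found: $src" >&2
    return 1
  fi
  # Step 1: create the directory (-p gives no error if it already exists)
  mkdir -p "$dest_dir" || return 1
  # Step 2: copy the token file into place
  cp "$src" "$dest_dir/kaggle.json" || return 1
  # Step 3: read/write for the owner only
  chmod 600 "$dest_dir/kaggle.json"
}
```

Usage, with the example path from Step 2: install_kaggle_token /mnt/disks/user/project/kaggle.json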
Step 4: Connect to our proxy
kaggle config set -n proxy -v $HTTPS_PROXY
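If HTTPS_PROXY happens to be empty in the current shell, the Step 4 command has no value to pass. The sketch below guards against that case; `set_kaggle_proxy` is a hypothetical helper name, and the message wording is only a suggestion.

```shell
# Hypothetical helper: only configure the proxy when HTTPS_PROXY has a value.
set_kaggle_proxy() {
  if [ -z "${HTTPS_PROXY:-}" ]; then
    echo "HTTPS_PROXY is not set in this shell; proxy not configured" >&2
    return 1
  fi
  # Same command as Step 4, with the variable quoted
  kaggle config set -n proxy -v "$HTTPS_PROXY"
}
```

Usage: set_kaggle_proxy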
Step 5: Use Kaggle's download command to download the required dataset into NimbleBox. The <dataset-location> is the <owner>/<dataset-name> part of the dataset's URL.
kaggle datasets download -d <dataset-location>
Example: kaggle datasets download -d alxmamaev/flowers-recognition
🔧Troubleshooting: A '401 unauthorised' error occurs when kaggle.json is out of date (the API token has expired). Generate a new kaggle.json file and upload it to NimbleBox.
Kaggle downloads the dataset as a zip archive.
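The archive name can be derived from the dataset location, which is handy when scripting the download and the unzip together. This is a minimal sketch, assuming the archive is named after the part of the location that follows the slash, as in the flowers-recognition example in this document; `dataset_zip_name` is a hypothetical helper name.

```shell
# Hypothetical helper: prints the archive name expected for a dataset location.
# Assumption: the archive is named after the part following the last slash.
dataset_zip_name() {
  location="$1"
  printf '%s\n' "${location##*/}.zip"
}
```

For example, dataset_zip_name alxmamaev/flowers-recognition prints flowers-recognition.zip.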

Steps to unzip the folder

Update the package list:
sudo apt update
Install the zip and unzip tools:
sudo apt install p7zip-full unzip zip
Use this command to extract the archive:
7za x <filename>.zip
Example: 7za x flowers-recognition.zip
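Extracting straight into the current directory can mix the dataset files with your project files. The sketch below extracts into a separate directory instead, using 7za when it is available and falling back to unzip; `extract_dataset` is a hypothetical helper name, and the optional second argument (the destination directory) is an assumption added for illustration.

```shell
# Hypothetical helper: extract an archive into a destination directory.
# $1: path to the zip archive
# $2: destination directory (assumption: defaults to the current directory)
extract_dataset() {
  archive="$1"
  dest="${2:-.}"
  if [ ! -f "$archive" ]; then
    echo "Archive not found: $archive" >&2
    return 1
  fi
  mkdir -p "$dest" || return 1
  if command -v 7za >/dev/null 2>&1; then
    # -o<dir> sets the output directory (no space after -o)
    7za x "$archive" -o"$dest"
  elif command -v unzip >/dev/null 2>&1; then
    # -d <dir> sets the output directory, -q keeps the output quiet
    unzip -q "$archive" -d "$dest"
  else
    echo "Neither 7za nor unzip is installed" >&2
    return 1
  fi
}
```

Usage: extract_dataset flowers-recognition.zip flowers-recognition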