Fashion-MNIST is a dataset of Zalando’s article images. It serves as a direct drop-in replacement for the original MNIST dataset for benchmarking machine learning algorithms. The dataset includes 60,000 training examples and 10,000 test examples, each a 28x28 grayscale image associated with a label from 10 classes.
- Description of Fashion-MNIST
- Where to Download the Fashion-MNIST Dataset
- Fashion MNIST in CSV
- Usage and Benchmarks
- Conclusion
Description of Fashion-MNIST
The classes in the Fashion-MNIST dataset represent different types of clothing and fashion items. These include:
Label | Description |
---|---|
0 | T-shirt/top |
1 | Trouser |
2 | Pullover |
3 | Dress |
4 | Coat |
5 | Sandal |
6 | Shirt |
7 | Sneaker |
8 | Bag |
9 | Ankle boot |
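
When working with the numeric labels in code, a small lookup helps turn a label back into its description. The sketch below simply mirrors the table above; the class name `FashionLabels` and the method `name` are made up for illustration and are not part of any library.

```java
import java.util.List;

// Hypothetical helper: maps a Fashion-MNIST label (0-9) to its description.
final class FashionLabels {

    // Order matches the label table: the list index is the label
    private static final List<String> NAMES = List.of(
            "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
            "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot");

    private FashionLabels() {
    }

    static String name(int label) {
        if (label < 0 || label >= NAMES.size()) {
            throw new IllegalArgumentException("label must be between 0 and 9, got " + label);
        }
        return NAMES.get(label);
    }

    public static void main(String[] args) {
        // Print the whole mapping
        for (int label = 0; label < 10; label++) {
            System.out.println(label + " -> " + name(label));
        }
    }
}
```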
Here’s an example of how the data looks
Fashion-MNIST was created to provide a more challenging benchmark dataset for machine learning and computer vision algorithms, as the original MNIST dataset (comprising hand-written digits) was considered too easy. The dataset aims to reflect real-world scenarios better and provide a more difficult challenge in image classification tasks.
Where to Download the Fashion-MNIST Dataset
- Access the Fashion-MNIST dataset at its origin: https://github.com/zalandoresearch/fashion-mnist/tree/master/data/fashion
- For those who prefer working with tabular data, a CSV version of the Fashion-MNIST dataset is available here: dataset-MNIST-fashion.zip
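
The converter further down expects uncompressed IDX files. Assuming the files fetched from the repository are gzip archives named like `train-images-idx3-ubyte.gz` (an assumption about the download, so check the names you actually get), they could be unpacked with a small sketch like this one. The class name `IdxGunzip` is made up for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.zip.GZIPInputStream;

// Hypothetical helper: unpacks gzip-compressed IDX files.
final class IdxGunzip {

    // Decompresses source into target, overwriting target if it exists
    static void gunzip(Path source, Path target) throws IOException {
        try (InputStream in = new GZIPInputStream(Files.newInputStream(source))) {
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed file names: the uncompressed name plus ".gz"
        String[] names = {
            "train-images-idx3-ubyte", "train-labels-idx1-ubyte",
            "t10k-images-idx3-ubyte", "t10k-labels-idx1-ubyte"
        };
        for (String name : names) {
            gunzip(Paths.get(name + ".gz"), Paths.get(name));
        }
    }
}
```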
Fashion MNIST in CSV
The zip file dataset-MNIST-fashion.zip contains two CSV files:
- mnist_fashion_train.csv (60,000 images)
- mnist_fashion_test.csv (10,000 images)
The format is:
label, 1x1, 1x2, 1x3, … 28x27, 28x28
where ixj is the pixel in the i-th row and j-th column, and the label is a number between 0 and 9. Each pixel is a grayscale value between 0 and 255.
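
The row format above can be read back into a label and a 28x28 pixel grid. This is a minimal sketch, assuming the CSV has one header row followed by one image per line; the class name `FashionCsvRow` and its fields are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper: parses one data row of the CSV format described above.
final class FashionCsvRow {

    static final int SIZE = 28;

    final int label;
    final int[][] pixels; // pixels[i][j] holds the CSV column "(i+1)x(j+1)"

    private FashionCsvRow(int label, int[][] pixels) {
        this.label = label;
        this.pixels = pixels;
    }

    static FashionCsvRow parse(String line) {
        String[] fields = line.split(",");
        int expected = 1 + SIZE * SIZE;
        if (fields.length != expected) {
            throw new IllegalArgumentException("expected " + expected + " fields, got " + fields.length);
        }
        int label = Integer.parseInt(fields[0].trim());
        int[][] pixels = new int[SIZE][SIZE];
        for (int i = 0; i < SIZE; i++) {
            for (int j = 0; j < SIZE; j++) {
                // The +1 skips the label; rows are stored one after another
                pixels[i][j] = Integer.parseInt(fields[1 + i * SIZE + j].trim());
            }
        }
        return new FashionCsvRow(label, pixels);
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to mnist_fashion_train.csv or mnist_fashion_test.csv
        try (BufferedReader reader = Files.newBufferedReader(Paths.get(args[0]))) {
            reader.readLine(); // skip the header row
            FashionCsvRow first = parse(reader.readLine());
            System.out.println("label of first image: " + first.label);
        }
    }
}
```

Keeping the pixels as a two-dimensional array keeps the indices close to the column names, at the cost of an off-by-one shift because the CSV columns start at 1.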
Here is the converter:

    import java.io.BufferedInputStream;
    import java.io.BufferedWriter;
    import java.io.FileInputStream;
    import java.io.FileWriter;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.Writer;

    public class MNISTConverter {

        public static void convert(String imgFileName, String labelFileName, String outFileName, int n) throws IOException {
            // Buffered streams avoid one system call per byte read or written
            try (InputStream f = new BufferedInputStream(new FileInputStream(imgFileName));
                 Writer o = new BufferedWriter(new FileWriter(outFileName));
                 InputStream l = new BufferedInputStream(new FileInputStream(labelFileName))) {

                // Header row: label,1x1,1x2,...,28x28
                final StringBuilder firstLine = new StringBuilder("label,");
                for (int i = 1; i < 29; i++) {
                    for (int j = 1; j < 29; j++) {
                        firstLine.append(i).append("x").append(j);
                        if (i != 28 || j != 28) {
                            firstLine.append(",");
                        }
                    }
                }
                o.write(firstLine.toString() + "\n");

                byte[] bytes = new byte[16];
                f.read(bytes, 0, 16); // skip the 16-byte header of the image file
                l.read(bytes, 0, 8);  // skip the 8-byte header of the label file

                // One CSV row per image: the label followed by 28 * 28 pixel values
                for (int i = 0; i < n; i++) {
                    StringBuilder image = new StringBuilder();
                    image.append(l.read() & 0xFF); // read label
                    for (int j = 0; j < 28 * 28; j++) {
                        image.append(",").append(f.read() & 0xFF); // read image bytes
                    }
                    o.write(image.toString() + "\n");
                }
            }
        }

        public static void main(String[] args) throws IOException {
            convert("train-images-idx3-ubyte", "train-labels-idx1-ubyte", "mnist_fashion_train.csv", 60000);
            convert("t10k-images-idx3-ubyte", "t10k-labels-idx1-ubyte", "mnist_fashion_test.csv", 10000);
        }
    }

If you have the uncompressed files (train-images-idx3-ubyte, train-labels-idx1-ubyte, t10k-images-idx3-ubyte, t10k-labels-idx1-ubyte) in the same directory as the script, you can run it with jbang:

    jbang MNISTConverter.java

This generates mnist_fashion_train.csv and mnist_fashion_test.csv.
The conversion is inspired by the one from Joseph Redmon https://pjreddie.com/projects/mnist-in-csv/
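
The 16 and 8 bytes that the converter skips are the IDX file headers. As far as the IDX layout used by MNIST-style files goes, the image file starts with four big-endian 32-bit integers (magic number 2051, image count, rows, columns) and the label file with two (magic number 2049, label count). The sketch below reads and checks the image header instead of skipping it, which can catch a wrong or still-compressed input file early. The class name `IdxImageHeader` is made up for illustration.

```java
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper: reads the 16-byte header of an IDX image file.
final class IdxImageHeader {

    static final int IMAGE_MAGIC = 2051; // 0x00000803

    final int count;
    final int rows;
    final int cols;

    private IdxImageHeader(int count, int rows, int cols) {
        this.count = count;
        this.rows = rows;
        this.cols = cols;
    }

    static IdxImageHeader read(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int magic = data.readInt(); // readInt reads 4 bytes, big-endian
        if (magic != IMAGE_MAGIC) {
            throw new IOException("not an IDX image file, magic number was " + magic);
        }
        int count = data.readInt();
        int rows = data.readInt();
        int cols = data.readInt();
        return new IdxImageHeader(count, rows, cols);
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to an uncompressed image file such as train-images-idx3-ubyte
        try (InputStream in = new BufferedInputStream(new FileInputStream(args[0]))) {
            IdxImageHeader header = read(in);
            System.out.println(header.count + " images of " + header.rows + "x" + header.cols);
        }
    }
}
```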
Usage and Benchmarks
- A nice comparison of different implementations against MNIST and Fashion-MNIST: http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/
- A nice article by the Fashion-MNIST creator: https://hanxiao.io/2018/09/28/Fashion-MNIST-Year-In-Review/
Conclusion
In conclusion, the Fashion-MNIST dataset is a valuable resource for both beginners and seasoned practitioners in the field of machine learning. It offers a more complex and realistic challenge than its predecessor, the original MNIST. Whether you are conducting academic research, developing commercial applications, or just exploring machine learning, Fashion-MNIST in its CSV form provides a convenient starting point.
Happy coding!