August 18th, 2024

Trailer Faces HQ Dataset

The Trailer Faces HQ Dataset includes 186,553 high-resolution face images from 15,379 movie trailers. It targets the limited diversity of facial expressions in earlier face datasets and is available under a Creative Commons BY-SA license.

The Trailer Faces HQ Dataset, created by Justin Pinkney, consists of 186,553 high-resolution face images sourced from movie trailers. This dataset was developed to address the limitations of existing datasets, such as FFHQ, which lacked diversity in facial expressions. Pinkney collected images from 15,379 trailers available on the Apple Movie Trailers website, resulting in approximately 2 TB of video data. The face detection process utilized a pre-trained Yolov5-face model, and a deduplication method was implemented to ensure only the sharpest images were retained. The final dataset underwent quality filtering to exclude undesirable images, and face identity information was extracted using a model from InsightFace_Pytorch. This allows users to access images of the same individual across different poses and settings. The dataset is available under a Creative Commons BY-SA license, encouraging users to share their applications of the data.

- The dataset contains 186,553 high-resolution face images from movie trailers.

- It was created to provide a more diverse range of facial expressions than existing datasets such as FFHQ.

- Images were sourced from 15,379 trailers, totaling around 2 TB of video.

- A pre-trained Yolov5-face model was used for face detection; a separate deduplication step kept only the sharpest image of each duplicate.

- The dataset is released under a Creative Commons BY-SA license for public use.
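
The deduplication step summarized above (keep only the sharpest copy of each repeated face) could look roughly like the sketch below. The summary does not say how sharpness was scored or how duplicates were matched, so everything here is an assumption for illustration: sharpness is the variance of a simple Laplacian filter, duplicates are faces whose embedding vectors have high cosine similarity, and the names `laplacian_variance`, `keep_sharpest`, and `threshold` are hypothetical.

```python
from math import sqrt


def laplacian_variance(gray):
    """Assumed sharpness score: variance of a 4-neighbour Laplacian
    over a 2D grid of grayscale values (higher means sharper)."""
    h, w = len(gray), len(gray[0])
    responses = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            lap = (gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1]
                   + gray[y][x + 1] - 4 * gray[y][x])
            responses.append(lap)
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)


def cosine_similarity(a, b):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm


def keep_sharpest(faces, threshold=0.9):
    """faces: list of (name, embedding, gray_crop) tuples.
    Greedily groups faces whose embedding is close to a group's first
    member, then returns the name of the sharpest face in each group."""
    groups = []
    for face in faces:
        for group in groups:
            if cosine_similarity(face[1], group[0][1]) >= threshold:
                group.append(face)
                break
        else:
            groups.append([face])
    return [max(g, key=lambda f: laplacian_variance(f[2]))[0] for g in groups]


# Toy data: a flat (blurry) crop and a checkerboard (sharp) crop.
flat = [[128] * 4 for _ in range(4)]
checker = [[255 if (x + y) % 2 == 0 else 0 for x in range(4)] for y in range(4)]

faces = [
    ("a_blurry", [1.0, 0.0], flat),
    ("a_sharp", [0.99, 0.05], checker),
    ("b", [0.0, 1.0], checker),
]
print(keep_sharpest(faces))  # ['a_sharp', 'b']
```

Greedy grouping against the first member of each group keeps the sketch short; a real pipeline over hundreds of thousands of crops would likely need an approximate nearest-neighbour index and a tuned similarity threshold.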

2 comments
By @Mistletoe - 7 months
An interesting dataset as I assume most actors are more attractive than the general population as well.

https://psych-neuro.com/2016/04/04/hollywood-beauty-does-fam...

>When Schmid compared these aspects of celebrity faces to those of non-celebrities, she found that the average attractiveness score out of 10 for non-celebrities was between 4-5 while celebrities tend to score at a minimum of 6. According to her algorithm, celebrities are significantly more attractive than non-celebrities with top scorers including Brad Pitt, George Clooney, Kate Upton, and Miley Cyrus.

By @notachatbot1234 - 7 months
> The dataset is released under the Creative Commons BY-SA

How can this be legal? All imagery is taken from (usually) non-free movie trailers.