Awesome to see this! I actually wrote PySceneDetect, and it was great to see it getting some use here. Would you be willing to share what parameters you were using? I'm curious why the accuracy was so low.
PySceneDetect only uses basic heuristic methods right now so it does require some degree of tuning to get things working for certain data sets. Your post inspired me to look into maybe integrating TransNetV2 as a detector in the future!
Nice to see you on here! I used ContentDetector with a threshold of 27.0 and otherwise default parameters (roughly the invocation sketched below). I realize I could have done a grid sweep to really home in on a good parameter range, but since I only had one labeled input video, I wanted something that would work well enough out of the box. I imagine this dataset is rather... heterogeneous.
If you happen to know a better a priori threshold, I would be happy to re-run the analysis and update the chart.
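For reference, this is roughly what I ran, assuming a recent PySceneDetect release with the high-level detect() API (the video filename here is just a placeholder):

    from scenedetect import detect, ContentDetector

    # Content-aware detection with the threshold I used (27.0).
    scene_list = detect("episode.mp4", ContentDetector(threshold=27.0))
    for i, (start, end) in enumerate(scene_list):
        print(f"Scene {i + 1}: {start.get_timecode()} - {end.get_timecode()}")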
If you're willing, could you try AdaptiveDetector? Its defaults should handle fast camera movement better.
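It should be a drop-in change, something like this (again assuming a recent release where AdaptiveDetector is available, with a placeholder filename):

    from scenedetect import detect, AdaptiveDetector

    # AdaptiveDetector compares each frame's content score against a
    # rolling average of neighboring frames rather than a fixed cutoff,
    # which tolerates fast camera motion better.
    scene_list = detect("episode.mp4", AdaptiveDetector())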
The threshold values themselves can be tuned if you generate a statsfile and plot the result, but that can get tedious when you have a lot of files (hence the huge interest in methods like TransNetV2; glad to see real-world applications of those in action). You can also just increase/decrease the threshold by 5-10% depending on whether you find it too sensitive or not sensitive enough.
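If it helps, here's a sketch of generating a statsfile through the Python API so the per-frame metrics can be plotted (file names are placeholders):

    from scenedetect import open_video, SceneManager, StatsManager, ContentDetector

    video = open_video("episode.mp4")
    stats = StatsManager()
    scene_manager = SceneManager(stats_manager=stats)
    scene_manager.add_detector(ContentDetector())
    scene_manager.detect_scenes(video)

    # The saved CSV has per-frame metrics you can plot to pick a threshold.
    with open("episode.stats.csv", "w") as f:
        stats.save_to_csv(f)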
Are there keyboard controls? Someone I was just playing with seemed to lock onto each "note" or "half note" exactly; I wasn't sure how they were doing that.
Author of PySceneDetect here: Thank you all for the thought-provoking discussions, and the attention you've given my side project. There are some specific cases where PySceneDetect achieves great accuracy (like fast cuts or fades), and some it currently handles poorly (like sudden flashes or large obstructions). That being said, I do want to track these things and come up with solutions to improve the robustness of the content detection algorithm over time.
I'm open to any feedback, feature requests, ideas, and suggestions; feel free to check out the issue tracker on GitHub, or create a new entry:
So this is likely beyond the scope of your project, but I've always thought a really good project would be a website to host scene indexes for movies and TV.
E.g., let's say that you wanted to watch a prerecorded football game or baseball game without all the commercials, timeouts, commentators talking about the fans, etc.
Or... let's say that you wanted to re-cut a movie in a certain way by re-ordering the scenes; you could just generate a new scene data file and let the encoder/player use that.
This is still relevant, I think :) What you mentioned is very similar to an edit decision list (EDL [1]), which I only learned about recently. I had a feature request [2] to support EDL as an output format, and upon further investigation, the format seems very similar to what you're describing. The Wikipedia page also indicates that VLC supports XSPF files ("XML Shareable Playlist Format"), but I haven't done much research into that yet.
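If I'm remembering the format correctly, a CMX3600-style EDL is just plain text listing clip in/out timecodes, roughly like this (all entries invented for illustration):

    TITLE: EXAMPLE_SCENES
    FCM: NON-DROP FRAME

    001  AX       V     C        00:00:00:00 00:00:12:10 00:00:00:00 00:00:12:10
    002  AX       V     C        00:00:12:10 00:00:45:03 00:00:12:10 00:00:45:03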
Author of the program here. You are correct, the program detects shots rather than scenes, but I didn't want to give the impression that this project was related to the existing ShotDetect program. I felt that the documentation explained this well enough, but I'm open to considering an alternative project name if anyone has a suggestion.
Would you be able to share a small sub-set of the episode, in particular the area where you're unable to detect the starting segment? (If not, no worries!)
There are a few issues with PySceneDetect currently that may lead to false or missed detections, but these are things that I would like to solve in the long run:
- the threshold is heuristic/fixed right now, but I would like to change it to an adaptive/statistical method that can adjust dynamically (rough sketch of the idea after this list)
- single-frame events can trigger false scene changes
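To illustrate the adaptive direction, something like the following could replace the fixed threshold (nothing like this is implemented yet; the function name and numbers are made up):

    import numpy as np

    def adaptive_cuts(frame_deltas, window=30, k=3.0):
        # Flag frame i as a cut when its delta exceeds the mean of the
        # surrounding window by k standard deviations, instead of
        # comparing against one fixed threshold for the whole video.
        cuts = []
        for i, delta in enumerate(frame_deltas):
            lo, hi = max(0, i - window), min(len(frame_deltas), i + window + 1)
            neighbors = np.delete(np.asarray(frame_deltas[lo:hi], dtype=float), i - lo)
            if len(neighbors) > 1 and delta > neighbors.mean() + k * neighbors.std():
                cuts.append(i)
        return cuts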
Thanks for your feedback, and feel free to share any other suggestions you might have.
Author of PySceneDetect here. The current implementation does exactly what you hint at, except instead of YUV, it considers deltas in the HSV domain (specifically, frame-to-frame differences in hue, saturation, and value).
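The core metric is roughly the following (a simplified sketch of the idea, not the exact implementation):

    import cv2
    import numpy as np

    def content_delta(prev_bgr, curr_bgr):
        # Convert both frames to HSV, take the mean absolute per-pixel
        # difference in each channel, then average the three channel deltas.
        prev = cv2.cvtColor(prev_bgr, cv2.COLOR_BGR2HSV).astype(np.int32)
        curr = cv2.cvtColor(curr_bgr, cv2.COLOR_BGR2HSV).astype(np.int32)
        channel_deltas = [np.abs(curr[..., c] - prev[..., c]).mean() for c in range(3)]
        return sum(channel_deltas) / 3.0  # a cut if this exceeds the threshold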
Other techniques being considered for future work include optical flow, background subtraction, and histogram analysis.
From what I remember, the Y (luma) component of a YUV video carries more information than the other two components, and it can also be extracted without fully decompressing the video (for MPEG-compressed video). Of course, this info is more than 10 years old (I don't really do any video research any more), so I imagine there has been progress in that area since.
This is indeed correct; I'm just using HSV instead of YUV, but the primary source of information is still the luma/brightness component (although currently all three HSV components are averaged equally, so perhaps a better weighting may improve precision).
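Concretely, the flat average in my earlier sketch could be swapped for a weighted combination; the weights below are invented and would need tuning against labeled data:

    # Hypothetical weighting that emphasizes the value (brightness) channel.
    weights = (0.5, 1.0, 1.5)  # (hue, saturation, value) - made-up values
    delta = sum(w * d for w, d in zip(weights, channel_deltas)) / sum(weights)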
This is definitely a good idea, and something I'm open to considering for a future release of PySceneDetect. Admittedly the current version does not handle single-frame "upsets" like this, but your suggestion seems like a logical and reasonable first attempt at filtering them out.
I would do an exponential smoothing of pixel values over some timescale, say, 0.2 seconds, before further detection of scene changes. That should do the trick.
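Something along these lines (just a sketch; the alpha derivation assumes a first-order filter with the stated 0.2 s time constant):

    import numpy as np

    def smooth_frames(frames, fps, timescale=0.2):
        # First-order exponential smoothing of pixel values; a single-frame
        # flash gets heavily damped before any delta computation happens.
        alpha = 1.0 - np.exp(-1.0 / (fps * timescale))
        smoothed = None
        for frame in frames:
            f = frame.astype(np.float64)
            smoothed = f if smoothed is None else smoothed + alpha * (f - smoothed)
            yield smoothed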
I've seen a proof of concept which combined the output of PySceneDetect with subtitle information and computer vision to allow you to do something like "go to the scene with the big castle" or something similar. I can't remember what it's called off the top of my head, but it seemed like a pretty cool concept.