
Metadata issues with a few .mpg files on Wikimedia Commons
Open, Low | Public | BUG REPORT

Description

https://commons.wikimedia.org/wiki/File:Cometa_Leonard_(C2021_A1)_en_M3.mpg

ffprobe output:

Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'Cometa_Leonard_(C2021_A1)_en_M3.mpg':
  Metadata:
    major_brand     : mp4v
    minor_version   : 0
    compatible_brands: mp4vmp42isom
  Duration: 00:00:11.04, start: 0.000000, bitrate: 856 kb/s
    Stream #0:0(und): Video: h264 (Baseline) (avc1 / 0x31637661), yuv420p, 640x640, 855 kb/s, 6.25 fps, 100 tbr, 600 tbn, 1200 tbc (default)

Metadata is missing width and height.

https://commons.wikimedia.org/wiki/File:Test_conductitivity.mpg

ffprobe output:

Input #0, mpeg, from 'Test_conductitivity.mpg':
  Duration: 00:00:20.37, start: 0.529089, bitrate: 742 kb/s
    Stream #0:0[0x1e0]: Video: mpeg2video (Main), yuv420p(tv, bt709, progressive), 636x360, 25 fps, 25 tbr, 90k tbn, 50 tbc
    Stream #0:1[0x1c0]: Audio: mp2, 44100 Hz, stereo, s16p, 128 kb/s

Metadata is missing width.

https://commons.wikimedia.org/wiki/File:%E0%A4%9A%E0%A4%BE%E0%A4%9A%E0%A5%8B%E0%A4%B0%E0%A4%A8%E0%A5%80_%E0%A4%AE%E0%A5%87%E0%A4%82_%E0%A4%A4%E0%A5%87%E0%A4%9C%E0%A4%BE%E0%A4%9C%E0%A5%80_%E0%A4%95%E0%A4%BE_%E0%A4%9D%E0%A4%82%E0%A4%A1%E0%A4%BE_%E0%A4%AB%E0%A4%B9%E0%A4%B0%E0%A4%BE%E0%A4%A4%E0%A5%87_%E0%A4%B9%E0%A5%81%E0%A4%8F.mpg

ffprobe output:

Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'चाचोरनी_में_तेजाजी_का_झंडा_फहराते_हुए.mpg':
  Metadata:
    major_brand     : 3gp4
    minor_version   : 512
    compatible_brands: 3gp43gp53g2a
    creation_time   : 2020-08-29T02:59:10.000000Z
  Duration: 00:01:13.81, start: 0.000000, bitrate: 679 kb/s
    Stream #0:0(eng): Video: h264 (Main) (avc1 / 0x31637661), yuv420p(tv, smpte170m/bt470bg/smpte170m), 360x640, 499 kb/s, 22 fps, 22 tbr, 1k tbn, 2k tbc (default)
    Metadata:
      creation_time   : 2020-08-29T02:59:10.000000Z
      handler_name    : vide
    Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 131 kb/s (default)
    Metadata:
      creation_time   : 2020-08-29T02:59:10.000000Z
      handler_name    : soun

Metadata is missing width, height, and duration.

https://commons.wikimedia.org/wiki/File:Disney_Channel_Taiwan_Closed.mpeg

ffprobe output:

Input #0, mpeg, from 'Disney_Channel_Taiwan_Closed.mpeg':
  Duration: 00:01:10.12, start: 0.500000, bitrate: 6253 kb/s
    Stream #0:0[0x1c0]: Audio: mp2, 48000 Hz, stereo, s16p, 128 kb/s
    Stream #0:1[0x1c1]: Audio: mp2, 48000 Hz, stereo, s16p, 128 kb/s
    Stream #0:2[0x1e2]: Video: h264 (High), yuv420p(top first), 1920x1080 [SAR 1:1 DAR 16:9], 29.97 fps, 29.97 tbr, 90k tbn, 59.94 tbc

Metadata is missing width, height, and duration.

Event Timeline

@Mitar: If these files don't provide such info, what is expected in this task? (Basically: why was this made a subtask of the task about software support for file formats?)

The files do provide this info; see the ffprobe output (duration, width, and height are all in there). But the MediaWiki software does not detect it correctly. So support for .mpg files seems incomplete, and some files are not handled correctly. This task is about supporting those files, too.
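
For anyone who wants to compare against what MediaWiki reports, here is a minimal Python sketch that pulls the same three fields from ffprobe's JSON output. It assumes ffprobe is on PATH; the helper names `summarize_probe` and `probe_file` are made up for illustration and are not part of MediaWiki.

```python
import json
import subprocess


def summarize_probe(probe: dict) -> dict:
    """Pull width, height and duration out of parsed ffprobe JSON."""
    # Use the first video stream, if there is one.
    video = next(
        (s for s in probe.get("streams", []) if s.get("codec_type") == "video"),
        None,
    )
    # ffprobe reports the container duration as a string of seconds.
    duration = probe.get("format", {}).get("duration")
    return {
        "width": video.get("width") if video else None,
        "height": video.get("height") if video else None,
        "duration": float(duration) if duration is not None else None,
    }


def probe_file(path: str) -> dict:
    """Run ffprobe (assumed to be on PATH) and summarize its JSON output."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-print_format", "json",
         "-show_format", "-show_streams", path],
        capture_output=True, text=True, check=True,
    )
    return summarize_probe(json.loads(result.stdout))
```

Calling `probe_file("Test_conductitivity.mpg")` on a local copy should return a dict with `width`, `height`, and `duration` keys that can be checked against the file page.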

Some digging:

https://commons.wikimedia.org/wiki/File:Cometa_Leonard_(C2021_A1)_en_M3.mpg
has the wrong file extension. It is actually an .mp4 file (and we do not support MPEG-4 files).

https://commons.wikimedia.org/wiki/File:Test_conductitivity.mpg
is an MPEG PS file with MPEG-1/2 video (supported). It has a pixel_aspect_ratio of 0. I do not remember whether that is allowed in MPEG PS, but we can probably test for it before we do $width = (int)( $width * $aspect ); and fix it.
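
To illustrate the kind of guard meant here, a rough sketch in Python (not the actual PHP code). Falling back to square pixels when the ratio is missing or not positive is an assumption about the desired behaviour, and `display_width` is a hypothetical name.

```python
def display_width(width: int, pixel_aspect_ratio: float) -> int:
    """Scale the coded width by the pixel aspect ratio.

    Assumption: a missing, zero, or negative ratio is treated as square
    pixels (1.0), so the coded width is returned unchanged.
    """
    if not pixel_aspect_ratio or pixel_aspect_ratio <= 0:
        return width
    return int(width * pixel_aspect_ratio)
```

Without the guard, a ratio of 0 turns 636 into 0, which would be consistent with the missing width on this file.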

https://commons.wikimedia.org/wiki/File:चाचोरनी_में_तेजाजी_का_झंडा_फहराते_हुए.mpg
has the wrong file extension. It is a .3gp MPEG-4 video file, not an MPEG-1/2 stream, and is not supported.

https://commons.wikimedia.org/wiki/File:Disney_Channel_Taiwan_Closed.mpeg
This is an MPEG PS stream with h264 video. It seems that getID3 does not support this, which isn't too surprising, as you'd have to decode the h264 and parse the entire stream before you can know the metadata. This could be a Blu-ray rip or something. BTW, this file is pretty much guaranteed to be a copyright violation, being a straight rip of Disney Channel content.

So should we delete all except the Test_conductitivity.mpg file? Or should I re-encode the first and third files as MPG and re-upload them?