maximum CPU utilization with ffmpeg and libx264

George D Pylant III
Background:

Regarding the current Mac Pro with dual quad-core (2.4 GHz) Westmere Xeons: my understanding is that these cores support hyperthreading, which allows two threads to run simultaneously on each core. So OS X "sees" 16 cores when there are actually only eight physical cores, and iStat Menus and Activity Monitor both show sixteen cores.
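
As a rough sketch of how the logical and physical counts described above could be checked from a terminal: the function names below are made up for illustration, and the sketch assumes macOS exposes the `hw.logicalcpu` and `hw.physicalcpu` sysctl keys. On other systems it only reports the logical count.

```shell
#!/bin/sh
# Hypothetical helpers (not part of ffmpeg or x264) that print CPU counts.

# Logical CPUs: what the OS schedules on, including hyperthreaded siblings.
logical_cpus() {
    if [ "$(uname -s)" = "Darwin" ]; then
        sysctl -n hw.logicalcpu
    else
        # Assumption: getconf or nproc is available on non-macOS systems.
        getconf _NPROCESSORS_ONLN 2>/dev/null || nproc
    fi
}

# Physical cores: only queried on macOS in this sketch.
physical_cpus() {
    if [ "$(uname -s)" = "Darwin" ]; then
        sysctl -n hw.physicalcpu
    else
        echo "unknown"
    fi
}

echo "logical:  $(logical_cpus)"
echo "physical: $(physical_cpus)"
```

On the machine described here, this would be expected to report sixteen logical and eight physical cores.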

For encoding I use the following command:

ffmpeg -y -i video.m2ts -threads 0 -acodec ac3 -ab 448k -ar 48000 -vcodec libx264 -coder ac -level 41 -b 5000k -refs 2 -flags +loop -flags2 +mixed_refs+dct8x8-fastpskip -me_method umh -subq 9 -me_range 16 -qmin 10 -qmax 50 -g 24 -keyint_min 2 -copyts video.mp4

I notice via iStat Menus or Activity Monitor that only eight cores are being used with -threads 0 or -threads 24.  The other eight "virtual" cores show very little activity.  With an analogous command using HandBrakeCLI (x86_64), iStat Menus or Activity Monitor shows all sixteen cores being used at almost 100% each.

My x264 configure line:  ./configure --prefix=${TARGET}

My ffmpeg configure line:  ./configure --prefix=${TARGET} --enable-nonfree --enable-gpl --enable-version3 --enable-libx264 --enable-pthreads --enable-libfaac --enable-libspeex --enable-libvpx --disable-decoder=libvpx --enable-libmp3lame --enable-libtheora --enable-libvorbis --enable-libopencore_amrwb --enable-libopencore_amrnb --enable-libgsm --enable-libopenjpeg --enable-libxvid --enable-libschroedinger --enable-libdirac --enable-libxavs --enable-librtmp --enable-avfilter --enable-filters --enable-postproc --target-os=darwin --arch=x86_64 --enable-runtime-cpudetect

My question(s):

I am trying to get the fastest ffmpeg encodes possible on my Mac Pro.  So why doesn't ffmpeg/libx264 use all "sixteen" cores, i.e. two threads per core, like HandBrakeCLI does?  I can use the same ffmpeg command line above on a PC that I have (Core i7 quad, 2.8 GHz, hyperthreading capable), and it shows all eight logical cores (i.e. two threads per core) being used at almost 100% each.  So I know ffmpeg uses two threads per core on my PC, but it seems to use only one thread per core on my Mac Pro.  Is there a flag or flags I should be using when compiling x264 and/or ffmpeg that would allow maximum CPU usage on the mid-2010 Mac Pro, i.e. use all "sixteen" cores instead of just eight, like HandBrakeCLI does?  Or is this a current limitation of ffmpeg/libx264 on the dual quad-core Westmere Xeon CPUs in a Mac Pro?

Thanks for any input.

George
_______________________________________________
ffmpeg-user mailing list
[hidden email]
http://ffmpeg.org/mailman/listinfo/ffmpeg-user

Re: maximum CPU utilization with ffmpeg and libx264

Reindl Harald

On 22.03.2011 02:41, George D Pylant III wrote:

> My understanding is that these cores support hyperthreading allowing two threads
> to run simultaneously on each core.

Yes and no.
Since it is not a full core/CPU, you cannot expect wonders.
Put simply, HT makes use of parts of the die that sit idle on a non-HT CPU.

> So why doesn't ffmpeg/libx264 use all "sixteen" cores, i.e. two threads
> per core like handbrakecli does?  -

Because performance does not scale linearly with more CPUs, and
not every operation can benefit from multithreading.

Who says that the 100% CPU usage from "handbrakecli" is not being wasted
on useless thread synchronisation? Try reducing the software
to only 8 threads; it may even be slightly faster than with 16.
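
A minimal sketch of the comparison suggested above, timing the same encode at several thread counts. The function names, the `ENCODER` variable and the input file name are assumptions made for illustration. The encode is video-only and discards its output through ffmpeg's null muxer, so it measures encoder speed rather than disk writes. Timing uses whole seconds, so it only makes sense for clips that take a while to encode.

```shell
#!/bin/sh
# Hypothetical benchmark helpers; ENCODER defaults to the ffmpeg binary on PATH.
ENCODER="${ENCODER:-ffmpeg}"

# time_encode THREADS INPUT
# Runs one libx264 encode with the given thread count, throws the result away,
# and prints the elapsed wall-clock seconds plus the encoder's exit status.
time_encode() {
    threads="$1"
    input="$2"
    start=$(date +%s)
    "$ENCODER" -y -i "$input" -threads "$threads" -an -vcodec libx264 \
        -f null - >/dev/null 2>&1
    status=$?
    end=$(date +%s)
    echo "threads=$threads seconds=$((end - start)) exit=$status"
}

# compare_threads INPUT COUNT...
# Repeats the timed encode once for each thread count given.
compare_threads() {
    input="$1"
    shift
    for t in "$@"; do
        time_encode "$t" "$input"
    done
}
```

Something like `compare_threads video.m2ts 8 16 24` would then print one line per thread count. Lower seconds means higher throughput, regardless of how busy the cores look in Activity Monitor.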

--

Best regards, Reindl Harald
the lounge interactive design GmbH
A-1060 Vienna, Hofmühlgasse 17
CTO / software-development / cms-solutions
p: +43 (1) 595 3999 33, m: +43 (676) 40 221 40
icq: 154546673, http://www.thelounge.net/


Re: maximum CPU utilization with ffmpeg and libx264

Robert Johnston
On March-22-11 1:53:41 AM, Reindl Harald wrote:
> On 22.03.2011 02:41, George D Pylant III wrote:
>
>> My understanding is that these cores support hyperthreading allowing two threads
>> to run simultaneously on each core.
>
> Yes and no.
> Since it is not a full core/CPU, you cannot expect wonders.
> Put simply, HT makes use of parts of the die that sit idle on a non-HT CPU.

Hyperthreading is an artifact of Intel's decision to lengthen its
instruction pipeline rather than increase the speed of execution. They
found with early test Pentium 4s that the long instruction pipe wasn't
getting anywhere near filled, and thus all their hard work was wasted.
Hyperthreading was implemented as a way of stuffing two shorter
instructions into one long pipe. If the instructions you are sending
are already optimised for VLIW processors (as GCC and the ff devs have
tried to do), then pumping these "long" instructions into the "virtual"
HT cores will result in a slowdown, as both instructions won't fit in
the pipe. This was demonstrated with properly multithreaded tasks back
when HT was first introduced: it was slower to run a raytracer with HT
enabled and using all cores (real and virtual) than it was to disable
HT and run with "half" the number of cores, or to restrict the
raytracer to just the "real" cores, when possible. HT is good for
multitasking, as it can stuff idle tasks into spare space in the pipe
when available, but it's bad for multithreading and true performance,
as it will slow down truly multi-core-aware applications.

Hope this helps.

Re: maximum CPU utilization with ffmpeg and libx264

Richard Buteau
In reply to this post by Reindl Harald


> -----Original Message-----
> From: [hidden email] [mailto:ffmpeg-user-
> [hidden email]] On Behalf Of Reindl Harald
> Sent: Tuesday, March 22, 2011 2:54 AM
> To: [hidden email]
> Subject: Re: [FFmpeg-user] maximum CPU utilization with ffmpeg and
> libx264
>
>
> On 22.03.2011 02:41, George D Pylant III wrote:
>
> > My understanding is that these cores support hyperthreading allowing
> > two threads to run simultaneously on each core.
>
> Yes and no.
> Since it is not a full core/CPU, you cannot expect wonders. Put simply,
> HT makes use of parts of the die that sit idle on a non-HT CPU.
>
> > So why doesn't ffmpeg/libx264 use all "sixteen" cores, i.e. two
> > threads per core like handbrakecli does?  -
>
> Because performance does not scale linearly with more CPUs, and not
> every operation can benefit from multithreading.
>
> Who says that the 100% CPU usage from "handbrakecli" is not being wasted
> on useless thread synchronisation? Try reducing the software to only
> 8 threads; it may even be slightly faster than with 16.

You actually get more throughput with x264 if you turn on hyperthreading. I ran some benchmarks with and without it about a year and a half ago on Nehalem-based systems. There was up to 15% more throughput with hyperthreading on and all cores in use (hyperthreaded or not). I wasn't looking at system load, just the average fps. Maybe these tests should be run again to make sure that is still true.
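
For anyone rerunning such a test, the throughput gain is just the relative difference between the two average fps figures. A small sketch, where the function name is made up and the fps values in the usage note are invented purely to illustrate the arithmetic, not measured results:

```shell
#!/bin/sh
# throughput_gain_pct FPS_HT_OFF FPS_HT_ON
# Prints the percentage change in average fps going from HT off to HT on.
throughput_gain_pct() {
    awk -v off="$1" -v on="$2" 'BEGIN {
        if (off <= 0) { print "baseline fps must be positive"; exit 1 }
        printf "%.1f\n", (on - off) / off * 100
    }'
}
```

For example, `throughput_gain_pct 20 23` prints 15.0, which is what a 15% gain would look like if the average went from 20 fps to 23 fps.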
