Optimization for reading *.d files


brenorg
Hi everybody,

I'm currently refactoring the build system at my company. We have around 20k C++ source files, and we use a form of (messed up) recursive make. After a lot of tinkering to get things right, I got to 10 seconds (maybe 5 with some more massaging) for a clean build. I need something around 1 second.

Anyway, I want to continue using GNU Make, and not fall back to CMake/Ninja. After some profiling, what's killing me is parsing the "*.d" files generated by the compiler.

The time to include all of my project's dependency files in one single makefile (as I want to get rid of recursive make) is 4 seconds.

I decided to get my hands dirty. I cloned GNU Make 4.0 and started analyzing the parser. Of course, the parser is complex. But those dependency files are so so simple! I want to leverage that information.

I added detection so the parser knows when it's parsing a ".d" file, and switches to my own "eval_d_file" function.

My function doesn't try to find special qualifiers, expand variables, or anything like that. It parses as if the only allowed syntax is what the compiler generates.
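
To give an idea of how little it has to handle, the whole thing is shaped roughly like this (a simplified sketch, not the exact code I hacked into make; record_dependency is a made-up placeholder for whatever enters the prerequisite into make's tables, and the stub and main are only there so it stands alone):

  /* Restricted scanner for compiler-generated .d files:
   * "foo.o: a.c b.h \" plus continuation lines, with the colon attached
   * to the target.  No variables, no functions, no target-specific
   * assignments, no escaped spaces.  */

  #include <stdio.h>
  #include <string.h>

  static void
  record_dependency (const char *tgt, size_t tlen,
                     const char *dep, size_t dlen)
  {
    printf ("%.*s <- %.*s\n", (int) tlen, tgt, (int) dlen, dep);
  }

  static void
  eval_d_file (const char *p)
  {
    const char *target = NULL;
    size_t target_len = 0;

    while (*p != '\0')
      {
        /* Skip blanks, newlines and backslash-newline continuations.  */
        if (*p == ' ' || *p == '\t' || *p == '\n')
          { ++p; continue; }
        if (p[0] == '\\' && p[1] == '\n')
          { p += 2; continue; }

        /* Scan one word.  */
        const char *start = p;
        while (*p != '\0' && *p != ' ' && *p != '\t' && *p != '\n'
               && !(p[0] == '\\' && p[1] == '\n'))
          ++p;
        size_t len = (size_t) (p - start);

        if (start[len - 1] == ':')
          {
            /* "foo.o:" starts a new rule.  */
            target = start;
            target_len = len - 1;
          }
        else if (target != NULL)
          record_dependency (target, target_len, start, len);
      }
  }

  int
  main (void)
  {
    eval_d_file ("main.o: main.cpp util.h \\\n config.h\n");
    return 0;
  }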

I got a speed-up: my original 4 seconds went down to 2 seconds, which is awesome. Of course, my implementation is probably buggy (since I don't understand everything going on in the code), but it seemed to work.

There are lots of dependency files and they can be processed in parallel, before being merged into the database.

For that, GNU Make would need an extension to the include directive to handle "include *.d" differently, since it would know dependency files won't alter/create variables but just add dependencies.

I really don't have the guts to use my hackish make version in production, but I'm up for developing this functionality if you agree it's a good proposal.

Thanks,
Breno G.


Re: Optimization for reading *.d files

Paul Smith-20
Before you go too far with performance improvements you should really
move to the latest version of GNU make (4.2.1) or even try the current
Git HEAD (but you'll need to install autotools etc. to build from Git).

It's less useful to be doing performance testing with a version of make
that old.


On Sat, 2017-03-18 at 19:25 -0700, brenorg wrote:
> There are lots of dependency files and they can be processed in parallel,
> before being merged into the database.

Well, make is not multithreaded so you can't process files in parallel.
I suppose that for slower disks it could be that some kind of
asynchronous file reading which would allow data to be retrieved from
the disk while make works on previously-retrieved data could be useful
but I'm not aware of any async file IO which is anywhere close to
portable.  Also, with SSD more common these days file IO latency is
nowhere near what it used to be, and decreasing all the time.

Someone would have to prove the extra complexity was gaining a
significant amount of performance before I would be interested.

> For that, GNU make could need an extension on the include directive to
> handle "include *.d" differently as it knows dependency files won't
> alter/create variables but just add dependencies.

I'm certainly not willing to do something like declare all included
files ending with .d should be treated as dependency files: people might
use .d files for other things, and they might create dependency files
with different names.  ".d" is just a convention and not a strong one at
that IMO.

However, it could be that the user would declare in her makefile that
all included files matching a given pattern be treated as simple files
(for example).  That would be acceptable, again if the gains were
significant enough.

I'm not convinced that it's impossible to speed up the parser in
general, though.  It shouldn't take twice as long to parse a simple line
using the full scanner, as it does with a targeted scanner.  After all,
you're not making use of any of the special features of parsing so those
should cost you very little.

I'd prefer to investigate improving the existing parser, rather than
create a completely separate parser path.


Re: Optimization for reading *.d files

brenorg
Paul Smith-20 wrote
> Before you go too far with performance improvements you should really
> move to the latest version of GNU make (4.2.1) or even try the current
> Git HEAD (but you'll need to install autotools etc. to build from Git).
>
> It's less useful to be doing performance testing with a version of make
> that old.

You are right. I'll do that and get back with the results.

Paul Smith-20 wrote
> On Sat, 2017-03-18 at 19:25 -0700, brenorg wrote:
> > There are lots of dependency files and they can be processed in parallel,
> > before being merged into the database.
>
> Well, make is not multithreaded so you can't process files in parallel.
> I suppose that for slower disks it could be that some kind of
> asynchronous file reading which would allow data to be retrieved from
> the disk while make works on previously-retrieved data could be useful
> but I'm not aware of any async file IO which is anywhere close to
> portable.  Also, with SSD more common these days file IO latency is
> nowhere near what it used to be, and decreasing all the time.
>
> Someone would have to prove the extra complexity was gaining a
> significant amount of performance before I would be interested.

So I suppose making it multi-threaded is out of the question, right? :) I hadn't thought of that.

Even if the operations aren't portable, it should be possible to ifdef the code and make it work at least on some systems. Not great, but I imagine it would work.
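
For example, even just a readahead hint would let the next file's IO overlap with parsing the current one, on systems that have it (a sketch of the kind of thing I mean, not tested; the function name is made up):

  #define _POSIX_C_SOURCE 200112L
  #include <fcntl.h>
  #include <unistd.h>

  /* Ask the kernel to start pulling a makefile we are about to parse
   * into the page cache; a no-op where posix_fadvise is unavailable.  */
  void
  readahead_makefile (const char *name)
  {
  #if defined (POSIX_FADV_WILLNEED)
    int fd = open (name, O_RDONLY);
    if (fd >= 0)
      {
        /* Advisory only: length 0 means "to the end of the file".  */
        posix_fadvise (fd, 0, 0, POSIX_FADV_WILLNEED);
        close (fd);
      }
  #else
    (void) name;
  #endif
  }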

And I don't think SSDs are that common yet. And even if they are, there are still tons of people working on NFS and other setups that add latency - especially for large projects.

Paul Smith-20 wrote
> > For that, GNU make could need an extension on the include directive to
> > handle "include *.d" differently as it knows dependency files won't
> > alter/create variables but just add dependencies.
>
> I'm certainly not willing to do something like declare all included
> files ending with .d should be treated as dependency files: people might
> use .d files for other things, and they might create dependency files
> with different names.  ".d" is just a convention and not a strong one at
> that IMO.
>
> However, it could be that the user would declare in her makefile that
> all included files matching a given pattern be treated as simple files
> (for example).  That would be acceptable, again if the gains were
> significant enough.

Yes, sorry I didn't make that clear. The user would have to be very much aware of what is being done to enable such a capability.

Paul Smith-20 wrote
> I'm not convinced that it's impossible to speed up the parser in
> general, though.  It shouldn't take twice as long to parse a simple line
> using the full scanner, as it does with a targeted scanner.  After all,
> you're not making use of any of the special features of parsing so those
> should cost you very little.
>
> I'd prefer to investigate improving the existing parser, rather than
> create a completely separate parser path.

I could agree if the difference were 10x or more, but I believe 2x is a reasonable gain for removing so many features. From what I saw in the code, the main hassle comes from target-specific variable assignments.

Having two parser paths should not be as bad as it sounds, since the simplified parser is much, much simpler. I needed no more than 20 lines to do it (it could be buggy, of course... but you get the point).


So to sum up:
0 - I will get back with results for a newer version.
1 - How crazy would it be to make it multi-threaded?
2 - This should be configurable with a very strong disclaimer. The alternative scanner wouldn't do any sanity checks, so it could be dangerous.
3 - Another option could be to create a separate tool that collects a bunch of "simple files" and pre-processes them into a compact database. The resulting file could then be read into the makefile. For that, make would have to understand this internal compact database format. Still, it would probably need a lot of code, even more than the simple scanner.



Re: Optimization for reading *.d files

Norbert Thiebaud
In reply to this post by brenorg
On Sat, Mar 18, 2017 at 9:25 PM, brenorg <[hidden email]> wrote:
>
> Anyway, I want to continue to use GNU Make, and not fallback to CMake/Ninja.
> After some profiling, what's killing me is parsing the "*.d" files generated
> by the compiler.
>
> The time to include all dependency files of my project in one single
> makefile (as I want to get rid of recursive make), is 4 seconds.
>

Have you looked at how much redundancy you have in all these dep files?
For LibreOffice, which uses one big make to build it all (60k files or so),
we wrote a step to combine and de-duplicate all these .d files,
reducing the amount that needs to be parsed by make by quite a bit.

In other words: instead of getting a faster process, reduce the amount to be processed.
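
Conceptually the step is nothing more than this kind of thing (a toy sketch just to show the shape; our real tool interns strings and uses a hash table rather than a linear scan, and it copes with more than the plain "target: prereq ..." layout assumed here):

  /* Toy "combine and de-duplicate": concatenate the .d files named on
   * the command line into one stream, dropping prerequisites that repeat
   * within the same rule.  Assumes the usual gcc layout, with the colon
   * attached to the target.  */

  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>

  static char *seen[65536];     /* prerequisites of the current rule */
  static size_t nseen;

  static int
  already_seen (const char *word)
  {
    for (size_t i = 0; i < nseen; i++)
      if (strcmp (seen[i], word) == 0)
        return 1;
    if (nseen < sizeof seen / sizeof seen[0])
      seen[nseen++] = strdup (word);
    return 0;
  }

  int
  main (int argc, char **argv)
  {
    char word[4096];

    for (int i = 1; i < argc; i++)
      {
        FILE *f = fopen (argv[i], "r");
        if (f == NULL)
          continue;
        while (fscanf (f, "%4095s", word) == 1)
          {
            if (strcmp (word, "\\") == 0)
              continue;                   /* drop line continuations */
            if (word[strlen (word) - 1] == ':')
              {                           /* new rule: reset the set */
                while (nseen > 0)
                  free (seen[--nseen]);
                printf ("\n%s", word);
              }
            else if (!already_seen (word))
              printf (" %s", word);
          }
        fclose (f);
      }
    putchar ('\n');
    return 0;
  }

The win comes both from the de-duplication itself and from make reading one combined file instead of thousands of small ones.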

Norbert


Re: Optimization for reading *.d files

brenorg
Hi Norbert,

You are absolutely right. There is much more redundancy than I expected. I joined all the .d files into one single file, and after running make on it and printing the database, the result is actually 10x smaller. And I know there must be more duplication, since my file names are not all relative to the same root yet.

I'll continue down this path and see if I can get acceptable performance. If not, I will insist a little more on this thread. The build system refactoring will also allow plugging in other builders like Ninja, so I can put this problem aside for now.

Thanks!
Breno G.

Re: Optimization for reading *.d files

Paul Smith-20
In reply to this post by brenorg
On Sat, 2017-03-18 at 22:49 -0700, brenorg wrote:
> > I'd prefer to investigate improving the existing parser, rather than
> > create a completely separate parser path.
>
> I could agree if the difference were 10x or more. But I believe 2x a
> reasonable gain from removing so many features. From what I looked on the
> code, the main hassle comes from the target specific variable assignment.

The thing is, the parser has to walk through all the characters on each
line at least once.  It can know, after that walk, whether there's
anything special about a given word or complete line.  There's no reason
the parser should have to do a lot of extra work, if it already knows
that there isn't anything interesting here to work on.  This is kind of
the same idea as the existing shell invocation fast-path: if the command
line in the recipe is simple enough then we don't have to invoke a
shell: we just invoke the command directly.  We tell whether it's simple
enough by merely looking for certain special characters in the command;
if we see any of them we automatically fall through to the slow path
(full shell invocation).
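
Schematically that test is nothing more than one scan for "interesting" characters, something of this shape (the character list below is illustrative, not the exact one make uses, and the function name is made up):

  #include <string.h>

  /* Return nonzero if the recipe line has to go to a shell, i.e. it
   * contains at least one character that is special to the shell (or to
   * make).  Otherwise the command can be run directly.  */
  int
  needs_a_shell (const char *line)
  {
    static const char special[] = "#;\"'*?[]&|<>(){}$`^~!\\";
    return strpbrk (line, special) != NULL;
  }

The same kind of single pass could, in principle, tell the makefile reader that a line contains no '$', '=', ';', etc. and so can skip most of the expensive machinery.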

Maybe that information isn't being provided where it's needed, to allow
short-cuts to be taken in the parser.  Or maybe there's some complexity
here that I haven't thought of that makes this impossible.

But I don't see why even a 2x slowdown should be expected by the full
parser.

> So to sum up:
> 0 - I will get back with results for a newer version.
> 1 - How crazy it would be to make it multi-threaded?

The question we have to ask is what is the upside.  If your disk is
slow, then you're waiting for your disk: a HD has only one head so you
can only read one file at a time from the disk no matter how many
threads you have.  Waiting for MORE disk IO isn't going to speed things
up appreciably, if the time spent actually parsing files is small
compared to the wait time for more content to parse.

If the parse time is more equal to the disk IO time, then you might get
some benefit from having some amount of lookahead, either by async IO or
one extra thread.
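
To be concrete, "one extra thread of lookahead" means something shaped like this, where a reader thread fetches file i+1 while the main thread parses file i (a sketch only, with read_whole_file and parse_buffer standing in for the real reading and parsing work; I'm not proposing this as-is):

  #include <pthread.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>

  struct job { const char *name; char *data; size_t len; };

  /* Slurp one file into memory; runs on the lookahead thread.  */
  static void *
  read_whole_file (void *arg)
  {
    struct job *j = arg;
    FILE *f = fopen (j->name, "rb");
    j->data = NULL;
    if (f == NULL)
      return NULL;
    fseek (f, 0, SEEK_END);
    j->len = (size_t) ftell (f);
    rewind (f);
    j->data = malloc (j->len + 1);
    j->len = fread (j->data, 1, j->len, f);
    j->data[j->len] = '\0';
    fclose (f);
    return NULL;
  }

  /* Stand-in for the real work (the makefile parser).  */
  static void
  parse_buffer (const char *name, const char *data)
  {
    printf ("parsed %s: %zu bytes\n", name, strlen (data));
  }

  int
  main (int argc, char **argv)
  {
    struct job cur, next;
    pthread_t reader;

    if (argc < 2)
      return 0;

    cur.name = argv[1];
    read_whole_file (&cur);          /* first file read synchronously */

    for (int i = 1; i < argc; i++)
      {
        int have_next = (i + 1 < argc);
        if (have_next)
          {
            /* Start fetching file i+1 while we parse file i.  */
            next.name = argv[i + 1];
            pthread_create (&reader, NULL, read_whole_file, &next);
          }

        if (cur.data != NULL)
          {
            parse_buffer (cur.name, cur.data);
            free (cur.data);
          }

        if (have_next)
          {
            pthread_join (reader, NULL);
            cur = next;
          }
      }
    return 0;
  }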

The question is do you REALLY get performance gains for this added
complexity?  I'm not convinced it's a no-brainer.  I'd need to see some
analysis showing exactly where the time is going during the parsing.

> 2- This should be configurable with a very strong disclaimer. The
> alternative scanner wouldn't do any sanity check, so it could be dangerous.
> 3 - Other option could involve creating a separate tool to collect a bunch
> of "simple files" and pre-process them into a compact database. That
> resulting file could then be read into the makefile. By doing that, Make
> would have to understand this internal compact database format. Still, it
> would probably need a lot code, even more than the simple scanner.

It's quite possible something like this could be done via an extension,
either Guile or a shared library, that maintained a database.  To make
it really efficient we'd need a new API that allowed extensions to
define new rules, or at least prerequisite definitions, but even without
that condensing the values to a single instance (as you've discovered)
could be helpful.

I mean something like, defining a new function that would parse a .d
file and add content into some kind of database.  If you use Guile you
could use guile-dbi with sqlite, or just roll your own.  This function
would be run as part of the recipe, after the compiler generates the .d
file (something tricky would need to be done here because normally
functions are expanded before the recipe is invoked).

Then in addition a new function would be defined that extracted content
from the database and fed it to make to define the rules.  Something
like:

  DATABASE = makedeps.sqlite
  $(guile (create-rules $(DATABASE)))

  %.o : %.c
          $(CC) ... # create .d file
          $(guile (parse-deps $(DATABASE) $*.d))

As I said, this won't work directly because the guile function would be
expanded before the compiler is run; something would have to happen
there.


Re: Optimization for reading *.d files

Norbert Thiebaud
In reply to this post by brenorg
On Sun, Mar 19, 2017 at 11:54 AM, brenorg <[hidden email]> wrote:
> Hi Norbert,
>
> You are absolutely right. There is much more redundancy than I expected. I
> joined all .d files in one single file and after running make on it and
> printing the database, it's actually 10x smaller. And I know there must be
> more duplication since my files names are not all relative to the same root
> yet.
>
> I'll continue this path and see if I can get acceptable performance

take a look at
https://cgit.freedesktop.org/libreoffice/core/tree/solenv/bin/concat-deps.c

it is specific to our project, but still, the general gist of it is there.

Norbert


Re: Optimization for reading *.d files

brenorg
Wow! I appreciate it :)

Re: Optimization for reading *.d files

brenorg
In reply to this post by Paul Smith-20
Paul Smith-20 wrote
> On Sat, 2017-03-18 at 22:49 -0700, brenorg wrote:
> > > I'd prefer to investigate improving the existing parser, rather than
> > > create a completely separate parser path.
> >
> > I could agree if the difference were 10x or more. But I believe 2x a
> > reasonable gain from removing so many features. From what I looked on the
> > code, the main hassle comes from the target specific variable assignment.
>
> The thing is, the parser has to walk through all the characters on each
> line at least once.  It can know, after that walk, whether there's
> anything special about a given word or complete line.  There's no reason
> the parser should have to do a lot of extra work, if it already knows
> that there isn't anything interesting here to work on.

Yes, that was my hope before I saw the code. Unfortunately, the code is not structured in a way that makes this optimization simple to implement. That's why I followed the "simpler scanner" path.

Paul Smith-20 wrote
> > So to sum up:
> > 0 - I will get back with results for a newer version.
> > 1 - How crazy it would be to make it multi-threaded?
>
> The question we have to ask is what is the upside.  If your disk is
> slow, then you're waiting for your disk: a HD has only one head so you
> can only read one file at a time from the disk no matter how many
> threads you have.  Waiting for MORE disk IO isn't going to speed things
> up appreciably, if the time spent actually parsing files is small
> compared to the wait time for more content to parse.
>
> If the parse time is more equal to the disk IO time, then you might get
> some benefit from having some amount of lookahead, either by async IO or
> one extra thread.
>
> The question is do you REALLY get performance gains for this added
> complexity?  I'm not convinced it's a no-brainer.  I'd need to see some
> analysis showing exactly where the time is going during the parsing.

I don't think the disk plays much of a role here. If the OS file cache is hot, most of the time should be spent in the parser - and that is what I see.
I ran perf on the actual code parsing a large number of files, and 80% of the time goes to eval_makefile/eval.


Paul Smith-20 wrote
> > 2- This should be configurable with a very strong disclaimer. The
> > alternative scanner wouldn't do any sanity check, so it could be dangerous.
> > 3 - Other option could involve creating a separate tool to collect a bunch
> > of "simple files" and pre-process them into a compact database. That
> > resulting file could then be read into the makefile. By doing that, Make
> > would have to understand this internal compact database format. Still, it
> > would probably need a lot code, even more than the simple scanner.
>
> It's quite possible something like this could be done via an extension,
> either Guile or a shared library, that maintained a database.  To make
> it really efficient we'd need a new API that allowed extensions to
> define new rules, or at least prerequisite definitions, but even without
> that condensing the values to a single instance (as you've discovered)
> could be helpful.
>
> I mean something like, defining a new function that would parse a .d
> file and add content into some kind of database.

I love the idea. A generic callback API would be nice and easy to support.
I don't know much about Guile. I will take a look at that.

The next step is to see how far condensing the values takes me, and I'll get back here if I think we can do better.


Re: Optimization for reading *.d files

Michael Stahl-6
In reply to this post by Norbert Thiebaud
On 19.03.2017 09:54, Norbert Thiebaud wrote:

> On Sat, Mar 18, 2017 at 9:25 PM, brenorg <[hidden email]> wrote:
>>
>> Anyway, I want to continue to use GNU Make, and not fallback to CMake/Ninja.
>> After some profiling, what's killing me is parsing the "*.d" files generated
>> by the compiler.
>>
>> The time to include all dependency files of my project in one single
>> makefile (as I want to get rid of recursive make), is 4 seconds.
>>
>
> have you looked in how much redundancy you have in all these dep ?
> For LibreOffice, which use one big make to build it all  (60k files or so)
> we wrote a step to combine and de-duplicate all these .d file
> reducing the amount that need to be parsed by make by quite a bit
>
> iow instead of getting a faster process, reduce the amount to be processed.

that can still be improved: Bjoern implemented something quite similar
to what Breno wants, an "includedepcache" keyword and
optimized/restricted dep file format, as previously discussed on this list:

https://mid.mail-archive.com/bug-make@.../msg08784.html

but it's probably not used by anybody currently since we don't want to
require a feature that's not in upstream GNU make.

the implementation is somewhere in here:

https://gerrit.libreoffice.org/gitweb?p=gnu-make-lo.git;a=shortlog;h=refs/heads/gnu-make-lo-4.0

