Discussion:
On Phobos GC hunt
Dmitry Olshansky via Digitalmars-d
2014-10-07 15:57:58 UTC
I made a proposal to quantitatively measure and tabulate all GC
allocations in Phobos before coming up with solutions to "@nogc
Phobos".

After an approving nod from Andrei, I've come up with a piece of
automation to extract this data and post it on the wiki.

So here is the exhaustive list of everything calling into GC in
Phobos (-vgc compiler flag):

http://wiki.dlang.org/Stuff_in_Phobos_That_Generates_Garbage

It includes source links, a wild guess at each function's name, and the
compiler's warning message for the potential GC call.

As far as data goes, this is about as good as we can get. The next
phase is labeling this stuff with potential solution(s). Again,
doing it all by hand is tedious and hardly useful.

Instead we need to observe patterns and label them automatically
until only the non-trivial subset remains. So everybody, please take
the time to identify simple patterns and post back your ideas on
solution(s).

So far I see the most frequent cases:
- `new SomeException` - switch to RC exceptions
- AA access - ??? (use user-defined AA type as parameter?)
- array concat - ???
- closure - ???
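
For readers unfamiliar with these four categories, here is a minimal sketch of what each one looks like in source. All names (`ParseError`, `parseDigit`, `countWord`, `greet`, `makeAdder`) are invented for illustration and are not taken from Phobos; the comments only describe which construct is the allocation site in each case.

```d
import std.exception : assertThrown;

// Hypothetical exception type, used only for this illustration.
class ParseError : Exception
{
    this(string msg) { super(msg); }
}

// 1. `new SomeException`: the exception object is allocated on the GC heap.
int parseDigit(char c)
{
    if (c < '0' || c > '9')
        throw new ParseError("not a digit");
    return c - '0';
}

// 2. AA access: assigning through an index may allocate a new entry.
void countWord(ref int[string] counts, string word)
{
    counts[word] = counts.get(word, 0) + 1;
}

// 3. Array concat: `~` builds a new GC-allocated array.
string greet(string name)
{
    return "hello, " ~ name;
}

// 4. Closure: the returned delegate outlives the call, so the frame
//    holding `step` has to be moved to the GC heap.
int delegate(int) makeAdder(int step)
{
    return (int x) => x + step;
}
```

None of these functions could be marked @nogc as written, which is why each pattern needs its own replacement strategy.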



---
Dmitry Olshansky
grm via Digitalmars-d
2014-10-07 16:23:18 UTC
1.) It may be helpful to reduce the noise by ignoring every match
after a new (and probably multiple 'operator ~' alarms
within the same statement).

2.) There seems to be a problem with repeated alarms:
When viewing the page source, this link shows up numerous times.
See
https://github.com/D-Programming-Language//phobos/blob/d4d98124ab6cbef7097025a7cfd1161d1963c87e/std/conv.d#L688

/Gerhard
Peter Alexander via Digitalmars-d
2014-10-07 16:37:29 UTC
Post by grm via Digitalmars-d
When viewing the page source, this link shows up numerous
times. See
https://github.com/D-Programming-Language//phobos/blob/d4d98124ab6cbef7097025a7cfd1161d1963c87e/std/conv.d#L688
That's because of multiple template instantiations of the same
function. These should probably be filtered for this use case.
Dmitry Olshansky via Digitalmars-d
2014-10-08 07:59:00 UTC
Post by grm via Digitalmars-d
1.) It may be helpful to reduce the noise in that every match
after a new is ignored (and probably multiple 'operator ~'
alarms within the same statement).
The tool is currently a quick line-based hack, hence it has no notion
of a statement.
It's indeed a good idea to merge all messages for one statement
and de-duplicate on a per-statement basis.
Post by grm via Digitalmars-d
When viewing the page source, this link shows up numerous
times. See
https://github.com/D-Programming-Language//phobos/blob/d4d98124ab6cbef7097025a7cfd1161d1963c87e/std/conv.d#L688
There are lots of toImpl overloads, and deduplication is done on a
module:LOC basis, so they all show up. I'm going to fix this in v2 by
merging all of them into one row.
grm via Digitalmars-d
2014-10-08 17:28:55 UTC
Was in a slight hurry and forgot to mention that I (quite sure:
we all) very much appreciate the hands-on mentality your approach
shows.

looking forward to v2

/Gerhard
Walter Bright via Digitalmars-d
2014-10-07 19:24:36 UTC
Thanks, Dmitri, this is great work. I suggest at a minimum that all of those get
notes added to their documentation that they gc allocate.
Brad Anderson via Digitalmars-d
2014-10-07 19:31:51 UTC
Post by Walter Bright via Digitalmars-d
Thanks, Dmitri, this is great work. I suggest at a minimum that
all of those get notes added to their documentation that they
gc allocate.
Seems like that's something that should just be automated where
possible instead of trying to update the documentation of
hundreds of functions.
Andrei Alexandrescu via Digitalmars-d
2014-10-07 22:09:55 UTC
Post by Walter Bright via Digitalmars-d
Thanks, Dmitri, this is great work. I suggest at a minimum that all of
those get notes added to their documentation that they gc allocate.
Seems like that's something that should just be automated where possible
instead of trying to update the documentation of hundreds of functions.
Could ddoc do that? -- Andrei
Jacob Carlborg via Digitalmars-d
2014-10-07 20:13:31 UTC
I did some processing of the data and these are the results I got:

772 | 'new' causes GC allocation
515 | operator ~= may cause GC allocation
380 | operator ~ may cause GC allocation
113 | array literal may cause GC allocation
 90 | setting 'length' may cause GC allocation
 77 | indexing an associative array may cause GC allocation
 34 | using closure causes GC allocation
 16 | 'delete' requires GC
  5 | associative array literal may cause GC allocation

Total 9

I didn't look at any source code to see what "new" is actually
allocating, for example.
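
A tally like the one above can be reproduced with a few lines of D. This is only a sketch: it assumes each compiler line has the shape `file.d(123): vgc: <message>`, and `tallyMessages` is a hypothetical helper, not part of the tool discussed in this thread.

```d
import std.algorithm.searching : findSplit;
import std.string : lineSplitter, strip;

/// Counts how often each message occurs in captured -vgc output.
/// Assumed line shape: "file.d(123): vgc: <message>".
size_t[string] tallyMessages(string compilerOutput)
{
    enum marker = "vgc: ";
    size_t[string] counts;
    foreach (line; compilerOutput.lineSplitter)
    {
        auto parts = line.findSplit(marker);
        if (parts[1].length == 0)
            continue; // marker not found, so this is not a -vgc line
        auto message = parts[2].strip;
        counts[message] = counts.get(message, 0) + 1;
    }
    return counts;
}
```

Grouping by message text alone is what makes the counts coarse: every 'new' lands in one bucket regardless of what is being allocated.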
--
/Jacob Carlborg
Peter Alexander via Digitalmars-d
2014-10-07 21:59:05 UTC
Post by Jacob Carlborg via Digitalmars-d
I didn't look at any source code to see what "new" is actually
allocating, for example.
I did some random sampling, and it's 90% exceptions, with the
occasional array allocation.

I noticed that a lot of the ~ and ~= complaints are in code that
only ever runs at compile time (generating strings for mixin). I
wonder if there's any way we can silence these false positives.
Dmitry Olshansky via Digitalmars-d
2014-10-08 07:52:36 UTC
Post by Jacob Carlborg via Digitalmars-d
I didn't look at any source code to see what "new" is actually
allocating, for example.
I did some random sampling, and it's 90% exceptions, with the
occasional array allocation.
That's interesting. I suspected around 50%. Well, that's even
better, since if we do ref-counted exceptions we solve 90% of the
problem ;)
I noticed that a lot of the ~ and ~= complaints are in code
that only ever runs at compile time (generating strings for
mixin). I wonder if there's any way we can silence these false
positives.
I'm going to use a blacklist for these, as the compiler can't in general
know whether a function is going to be used exclusively at CTFE or not.

Okay, I think I should go a bit further with the second version
of the tool.

Things on todo list:
- make tool general enough to work for any GitHub based project
(and hackable for other hostings)
- use Brian's D parser to accurately find artifacts
- detect "throw new SomeStuff" pattern and automatically
populate potential fix line
- list all source links in one column for the same function
(this needs a proper parser)
- use blacklist of <module-name>:<artifact name> to filter out
CTFE
- use current data from wiki for "potential fix" column if
present

Holy grail is:
- plot a DOT call-graph of GC users, with the leaves being the ones
reported by -vgc. So I start with this list, then add the functions
that call them, then the functions that use those functions, and so on.
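
The propagation step of that call-graph idea could be sketched as follows. This is not code from the tool; `gcUsers` and `toDot` are hypothetical names, and the `callers` map (function name to the functions that call it) is assumed to come from a real parser.

```d
import std.array : appender;
import std.format : formattedWrite;

/// Starting from the leaves reported by -vgc, collects every function
/// that reaches one of them. `callers[f]` lists the direct callers of `f`.
string[] gcUsers(string[][string] callers, string[] leaves)
{
    bool[string] seen;
    string[] order;
    string[] work = leaves.dup;
    while (work.length)
    {
        auto f = work[$ - 1];
        work = work[0 .. $ - 1];
        if (f in seen)
            continue;
        seen[f] = true;
        order ~= f;
        if (auto cs = f in callers)
            work ~= *cs;
    }
    return order;
}

/// Emits one DOT edge "caller -> callee" per call that reaches the GC.
string toDot(string[][string] callers, string[] leaves)
{
    auto sink = appender!string();
    sink.put("digraph gc {\n");
    foreach (f; gcUsers(callers, leaves))
    {
        if (auto cs = f in callers)
        {
            foreach (c; *cs)
                sink.formattedWrite("  \"%s\" -> \"%s\";\n", c, f);
        }
    }
    sink.put("}\n");
    return sink.data;
}
```

The `seen` set keeps recursion and shared callers from being visited twice, so the walk terminates even on cyclic call graphs.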
Dmitry Olshansky via Digitalmars-d
2014-10-14 13:29:32 UTC
The new version is out. It's a bit rough for a proper
announcement yet and misses a couple of things from my todo list,
but the improvement is so radical I decided to share it anyway.

With the new pattern-matcher/parser I hacked together on top
of Brian's lexer, it's now surgically precise in labeling
artifacts. I also retained as much as possible of the original
comments (line numbers have changed) and grouped source links
per artifact.

Updated Wiki:
http://wiki.dlang.org/Stuff_in_Phobos_That_Generates_Garbage

Tool:
https://github.com/DmitryOlshansky/gchunt


It's also "universal", as in it works for any GitHub-hosted D project;
for example, here is the output for druntime:

http://wiki.dlang.org/Stuff_in_Druntime_That_Generates_Garbage

Still todo:
- blacklisting of modules/artifacts
- detect usage of (i)dup
- label throw new xyz as `EX`
- a few bugs to fix in artifact labeling
Chris via Digitalmars-d
2014-10-15 11:25:56 UTC
Thanks a million! That's very very useful.
Dmitry Olshansky via Digitalmars-d
2014-10-16 19:38:17 UTC
Post by Chris via Digitalmars-d
Thanks a million! That's very very useful.
I sure hoped so! :)
Sadly I'm going to be incredibly busy this weekend, so the proper
release date shifts to sometime afterwards.

Johannes Pfau via Digitalmars-d
2014-10-08 08:13:47 UTC
Post by Peter Alexander via Digitalmars-d
Post by Jacob Carlborg via Digitalmars-d
I didn't look at any source code to see what "new" is actually
allocating, for example.
I did some random sampling, and it's 90% exceptions, with the
occasional array allocation.
I noticed that a lot of the ~ and ~= complaints are in code that
only ever runs at compile time (generating strings for mixin). I
wonder if there's any way we can silence these false positives.
Code in if(__ctfe) blocks could be (and should be) allowed:
https://github.com/D-Programming-Language/dmd/pull/3572

But if you have a normal function (string generateMixin()), the
compiler can't really know that it's only used at compile time. And if
it's not a template, the code using the GC will be compiled even if
it's never called. This might be enough to get undefined symbol errors
if you don't have a GC, so the error messages are kinda valid.
Andrei Alexandrescu via Digitalmars-d
2014-10-08 20:01:43 UTC
Post by Johannes Pfau via Digitalmars-d
https://github.com/D-Programming-Language/dmd/pull/3572
But if you have got a normal function (string generateMixin()) the
compiler can't really know that it's only used at compile time. And if
it's not a template the code using the GC will be compiled, even if
it's never called. This might be enough to get undefined symbol errors
if you don't have a GC, so the error messages are kinda valid.
That's a bummer. Can we get the compiler to remove the "if (__ctfe)"
code after semantic checking?

Andrei
Andrei Alexandrescu via Digitalmars-d
2014-10-08 20:10:11 UTC
Post by Andrei Alexandrescu via Digitalmars-d
That's a bummer. Can we get the compiler to remove the "if (__ctfe)"
code after semantic checking?
Or would "static if (__ctfe)" work? -- Andrei
Steven Schveighoffer via Digitalmars-d
2014-10-08 20:15:51 UTC
Post by Andrei Alexandrescu via Digitalmars-d
Post by Andrei Alexandrescu via Digitalmars-d
That's a bummer. Can we get the compiler to remove the "if (__ctfe)"
code after semantic checking?
Or would "static if (__ctfe)" work? -- Andrei
Please don't ask me to explain why, because I still don't know. But
__ctfe is a normal runtime variable :) It has been explained to me
before why it has to be a runtime variable. I think Don knows the answer.

-Steve
Peter Alexander via Digitalmars-d
2014-10-08 20:30:45 UTC
Post by Steven Schveighoffer via Digitalmars-d
Post by Andrei Alexandrescu via Digitalmars-d
Post by Andrei Alexandrescu via Digitalmars-d
That's a bummer. Can we get the compiler to remove the "if
(__ctfe)"
code after semantic checking?
Or would "static if (__ctfe)" work? -- Andrei
Please don't ask me to explain why, because I still don't know.
But __ctfe is a normal runtime variable :) It has been explained
to me before why it has to be a runtime variable. I think Don
knows the answer.
Well, the contents of the static if expression have to be
evaluated at compile time, so static if (__ctfe) would always be
true.

Also, if it were to somehow work as imagined then you'd have
nonsensical things like this:

static if (__ctfe) class Wat {}
auto foo() {
    static if (__ctfe) return new Wat();
    return null;
}
static wat = foo();

wat now has a type at runtime that only exists at compile time.
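
By contrast, the ordinary run-time form `if (__ctfe)` is well defined. The sketch below uses a hypothetical function `whereAmI` to show the same code taking different branches depending on whether the compiler is interpreting the call or the compiled program is executing it.

```d
// Hypothetical example: the same function body, two execution contexts.
int whereAmI()
{
    if (__ctfe)
        return 1; // taken only while the compiler interprets the call
    else
        return 2; // taken in the compiled program
}

// An enum initializer has to be known at compile time, so this forces CTFE.
enum atCompileTime = whereAmI();
```

Both branches are type-checked in the usual way, so nothing here can produce a type that exists in only one of the two worlds.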
bearophile via Digitalmars-d
2014-10-08 20:16:02 UTC
Post by Andrei Alexandrescu via Digitalmars-d
Or would "static if (__ctfe)" work? -- Andrei
Currently it doesn't work, because __ctfe is a run-time variable.
Walter originally tried and failed to make it a compile-time
variable.

Bye,
bearophile
ketmar via Digitalmars-d
2014-10-08 20:20:13 UTC
Post by Andrei Alexandrescu via Digitalmars-d
Or would "static if (__ctfe)" work? -- Andrei
ha! The Famous Bug! it works, but not as people expected. as "static
if" evaluates when function is *compiling*, __ctfe is false there, and
so the whole "true" branch will be removed as dead code.

i believe that the compiler should warn about this, 'cause i tend to
repeatedly hit this funny thing.
ketmar via Digitalmars-d
2014-10-08 20:25:18 UTC

p.s. or vice versa: "static if (__ctfe)" is always true, so non-ctfe
code will be removed. sorry, i can't really remember which is true, but
anyway, it works by removing one of the branches altogether.
ketmar via Digitalmars-d
2014-10-08 20:33:40 UTC
Post by ketmar via Digitalmars-d
On Wed, 8 Oct 2014 23:20:13 +0300
p.s. or vice versa: "static if (__ctfe)" is always true, so non-ctfe
code will be removed. sorry, i can't really remember which is true, but
anyway, it works by removing one of the branches altogether.
hm. i need some sleep. or new keyboard. or both.
Timon Gehr via Digitalmars-d
2014-10-08 21:40:01 UTC
Post by ketmar via Digitalmars-d
On Wed, 8 Oct 2014 23:20:13 +0300
p.s. or vice versa: "static if (__ctfe)" is always true, so non-ctfe
code will be removed. sorry, i can't really remember which is true, but
anyway, it works by removing one of the branches altogether.
This is probably a regression somewhere after 2.060, because with 2.060
I get

Error: variable __ctfe cannot be read at compile time
Error: expression __ctfe is not constant or does not evaluate to a bool

as I'd expect.
ketmar via Digitalmars-d
2014-10-08 22:18:29 UTC
Post by Timon Gehr via Digitalmars-d
This is probably a regression somewhere after 2.060, because with
2.060 I get
Error: variable __ctfe cannot be read at compile time
Error: expression __ctfe is not constant or does not evaluate to a bool
as I'd expect.
i remember now that i was copypasting toHash() from druntime some time
ago and changed "if (__ctfe)" to "static if (__ctfe)" in the process. it
compiled and worked fine, and i didn't even notice what i did until i
tried to change the non-ctfe part of toHash() and found that my changes had
no effect at all. and then i discovered that "static".

this was 2.066 or 2.067-git.

and now i can clearly say that "static if (__ctfe)" leaves only the ctfe
part.

that was somewhat confusing, as i was pretty sure that "if (__ctfe)"
*must* be used with "static".
Marco Leise via Digitalmars-d
2014-10-10 07:14:28 UTC
Post by ketmar via Digitalmars-d
On Wed, 08 Oct 2014 13:10:11 -0700
Andrei Alexandrescu via Digitalmars-d <digitalmars-d at puremagic.com>
Post by Andrei Alexandrescu via Digitalmars-d
Or would "static if (__ctfe)" work? -- Andrei
ha! The Famous Bug! it works, but not as people expected. as "static
if" evaluates when function is *compiling*, __ctfe is false there, and
so the whole "true" branch will be removed as dead code.
i believe that compiler should warn about this, 'cause i'm tend to
repeatedly hit this funny thing.
Lol, definitely! I made that mistake myself, and so did Robert
Schadek in his std.logger. It is now the #1 bug in D code.
--
Marco
ketmar via Digitalmars-d
2014-10-10 21:21:41 UTC
Post by Marco Leise via Digitalmars-d
Lol, definitely! I made that mistake myself and Robert
Schadek, too in his std.logger. It is now the #1 bug in D code.
i made a quick patch that warns on "static if (__ctfe)":
https://issues.dlang.org/show_bug.cgi?id=13601
Johannes Pfau via Digitalmars-d
2014-10-09 09:43:04 UTC
Post by Andrei Alexandrescu via Digitalmars-d
That's a bummer. Can we get the compiler to remove the "if (__ctfe)"
code after semantic checking?
Andrei
I think you misunderstood: code in if(__ctfe) is already removed, it
never gets into the binary. But the @nogc/-vgc checks still complain
about GC allocations in if(__ctfe). This is easy to fix, but as __ctfe is
a runtime variable you could also write if(__ctfe || dice() == 1), and
the question of how to handle such complex cases stalled pull #3572.

What I meant is that the compiler can't know that this code is
CTFE-only and -vgc must complain:

string generateMixin(string a)
{
    return "int " ~ a ~ ";";
}
mixin(generateMixin("x"));

But there are workarounds:
http://dpaste.dzfl.pl/e689585c0a95
(Note that dead-code elimination should be able to remove all functions
marked as private)
Andrei Alexandrescu via Digitalmars-d
2014-10-07 22:35:52 UTC
Post by Dmitry Olshansky via Digitalmars-d
I made a proposal to quantitatively measure and tabulate all GC
After an approving nod from Andrei I've come up with a piece of automation
to extract this data and post it on wiki.
So here is the exhaustive list of everything calling into GC in Phobos
http://wiki.dlang.org/Stuff_in_Phobos_That_Generates_Garbage
Awesome! I've started adding explanations to the first few entries,
let's use that crowdsourcing thing to fill this! -- Andrei
Xinok via Digitalmars-d
2014-10-08 02:28:40 UTC
Post by Dmitry Olshansky via Digitalmars-d
So here is the exhaustive list of everything calling into GC in
http://wiki.dlang.org/Stuff_in_Phobos_That_Generates_Garbage
A correction on TimSortImpl: it does actually generate garbage by
calling uninitializedArray to allocate the buffer. The check for
__ctfe is unnecessary (it may have been needed some time ago or
was added naively).

Timsort needs up to O(n/2) auxiliary space and so requires a buffer,
but there's no reason for it to be GC-allocated. It could simply be
malloc'd and then freed before the function returns.
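
The malloc-and-free idea can be sketched with a plain merge step. This is a simplified illustration, not Timsort and not the Phobos implementation; `mergeHalves` is a hypothetical function that merges two already-sorted halves using a scratch buffer released on every exit path.

```d
import core.stdc.stdlib : free, malloc;

/// Merges the sorted runs a[0 .. mid] and a[mid .. $] in place.
/// Only the left run is copied out, into a malloc'd scratch buffer.
/// Returns false if the buffer could not be allocated.
bool mergeHalves(int[] a, size_t mid) @nogc nothrow
{
    if (mid == 0 || mid >= a.length)
        return true; // one run is empty, nothing to merge

    auto raw = cast(int*) malloc(mid * int.sizeof);
    if (raw is null)
        return false; // caller decides how to report out-of-memory
    scope (exit) free(raw); // runs on every path out of the function

    int[] left = raw[0 .. mid];
    left[] = a[0 .. mid];

    size_t i = 0, j = mid, k = 0;
    while (i < mid && j < a.length)
        a[k++] = (a[j] < left[i]) ? a[j++] : left[i++];
    while (i < mid)
        a[k++] = left[i++];
    // Any remaining right-run elements are already in their final place.
    return true;
}
```

Because the function carries the @nogc attribute, the compiler itself rejects any accidental GC allocation inside it.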
Johannes Pfau via Digitalmars-d
2014-10-08 11:25:21 UTC
Post by Dmitry Olshansky via Digitalmars-d
Instead we need to observe patterns and label it automatically
until the non-trivial subset remains. So everybody, please take
time and identify simple patterns and post back your ideas on
solution(s).
I just had a look at all closure allocations and identified these
patterns:


1) Fixable by manually stack-allocating closure
A delegate is passed to some function which stores this delegate and
therefore correctly doesn't mark the parameter as scope. However,
the lifetime of the stored delegate is still limited to the current
function (e.g. it's stored in a struct instance, but on the stack).

Can be fixed by creating a static struct{T... members; void
doSomething(){access members}} instance on stack and passing
&stackvar.doSomething as delegate.

2) Using delegates to add state to ranges
----
return iota(dim).
filter!(i => ptr[i])().
map!(i => BitsSet!size_t(ptr[i], i * bitsPerSizeT))().
joiner();
----
This code adds state to ranges without declaring a new type: the ptr
variable is not accessible and needs to be moved into a closure.
Declaring a custom range type is a solution, but not
straightforward: If the ptr field is moved into the range, a closure
is not necessary. But if the range is copied, its address changes
and the delegate passed to map is now invalid.

3) Functions taking delegates as generic parameters
receiveTimeout, receive, and formattedWrite accept different types,
including delegates. The delegates could all be marked scope to avoid
the allocation, but is void foo(T)(scope T) a good idea? The alternative
is probably adding an overload for delegates with the scope attribute.

(The result is that all functions calling receiveTimeout,... with a
delegate allocate a closure)

4) Solvable with manual memory management
Some specific functions can't be easily fixed, but the delegates
they create have a well defined lifetime (for example spawn creates
a delegate which is only needed at the startup of a new thread, it's
never used again). These could be malloc+freed.

5) Design issue
These functions generally create a delegate using variables passed
in as parameters. There's no way to avoid closures here. Although
manual allocation is possible, the lifetime is undefined and can
only be managed by the GC.

6) Other
Two cases can be fixed by moving a buffer into a struct or moving a
function out of a member function into its surrounding class.


Also notable: 17 out of 35 cases are in std.net.curl. This is because
curl heavily uses delegates and wrapper delegates.
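
Patterns 1 and 3 above can be sketched side by side. Everything here is hypothetical (`each3`, `Summer`, `sumViaStruct`, `sumViaScopeLambda`); the sketch only assumes a callee that promises, via `scope`, not to keep the delegate beyond the call.

```d
// Hypothetical callee: `scope` promises the delegate is not stored.
void each3(scope void delegate(int) @nogc dg) @nogc
{
    foreach (i; 0 .. 3)
        dg(i);
}

// Pattern 1: the captured state lives in an ordinary struct ...
struct Summer
{
    int total;
    void add(int x) @nogc { total += x; }
}

int sumViaStruct() @nogc
{
    Summer s;      // ... which sits on the caller's stack,
    each3(&s.add); // and &s.add is a delegate whose context is &s.
    return s.total;
}

// Pattern 3: with a scope parameter the literal needs no heap closure,
// so the enclosing function can still be @nogc.
int sumViaScopeLambda() @nogc
{
    int total;
    each3((int x) { total += x; });
    return total;
}
```

The struct version stays valid only while `s` is alive and is not moved, which is the same lifetime restriction the pattern description spells out.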
Dmitry Olshansky via Digitalmars-d
2014-10-08 12:08:59 UTC
Post by Johannes Pfau via Digitalmars-d
Am Tue, 07 Oct 2014 15:57:58 +0000
Post by Dmitry Olshansky via Digitalmars-d
Instead we need to observe patterns and label it automatically
until the non-trivial subset remains. So everybody, please
take time and identify simple patterns and post back your
ideas on solution(s).
I just had a look at all closure allocations and identified
these
Awesome! This is exactly the kind of help I wanted.
Post by Johannes Pfau via Digitalmars-d
1) Fixable by manually stack-allocating closure
A delegate is passed to some function which stores this
delegate and
therefore correctly doesn't mark the parameter as scope.
However,
the lifetime of the stored delegate is still limited to the
current
function (e.g. it's stored in a struct instance, but on the
stack).
Can be fixed by creating a static struct{T... members; void
doSomething(){access members}} instance on stack and passing
&stackvar.doSomething as delegate.
Hm... Probably we can create a template for this.
Post by Johannes Pfau via Digitalmars-d
2) Using delegates to add state to ranges
----
return iota(dim).
filter!(i => ptr[i])().
map!(i => BitsSet!size_t(ptr[i], i * bitsPerSizeT))().
joiner();
----
This code adds state to ranges without declaring a new type: the ptr
variable is not accessible and needs to be moved into a closure.
Declaring a custom range type is a solution, but not
straightforward: If the ptr field is moved into the range, a closure
is not necessary. But if the range is copied, its address changes
and the delegate passed to map is now invalid.
Indeed, such code is fine in "user-space" but has no place in
the library.
Post by Johannes Pfau via Digitalmars-d
3) Functions taking delegates as generic parameters
receiveTimeout,receive,formattedWrite accept different types,
including delegates. The delegates can all be scope to avoid the
allocation but is void foo(T)(scope T) a good idea? The
alternative
is probably making an overload for delegates with scope
attribute.
(The result is that all functions calling receiveTimeout,... with a
delegate allocate a closure)
4) Solvable with manual memory management
Some specific functions can't be easily fixed, but the
delegates
they create have a well defined lifetime (for example spawn
creates
a delegate which is only needed at the startup of a new
thread, it's
never used again). These could be malloc+freed.
I think this and (2) can be solved if we come up with solid
support for RC-closures.
Post by Johannes Pfau via Digitalmars-d
5) Design issue
These functions generally create a delegate using variables passed
in as parameters. There's no way to avoid closures here. Although
manual allocation is possible, the lifetime is undefined and can
only be managed by the GC.
6) Other
Two cases can be fixed by moving a buffer into a struct or moving a
function out of a member function into its surrounding class.
Yeah, there are always outliers ;)
Post by Johannes Pfau via Digitalmars-d
Also notable: 17 out of 35 cases are in std.net.curl. This is
because
curl heavily uses delegates and wrapper delegates.
Interesting... it must be due to cURL's callback-based API.
All in all, std.net.curl is a constant source of complaints; it
may need some work to fix other issues anyway.
Kagamin via Digitalmars-d
2014-10-08 16:59:46 UTC
Post by Dmitry Olshansky via Digitalmars-d
I think this and (2) can be solved if we come up with solid
support for RC-closures.
Delegates don't obey data sharing type checks though. That's a
long-standing language issue.
Johannes Pfau via Digitalmars-d
2014-10-09 10:01:30 UTC
Another observation: idup/dup are not reported by -vgc. (This is correct
behavior: @nogc treats these as normal functions without the @nogc
attribute and complains, whereas -vgc does not report calls to non-@nogc
functions.)
However, idup/dup might be common, so it might make sense to grep for
them manually?
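
A manual search of that kind could look like the sketch below. `dupLines` is a hypothetical helper, and a plain text match like this will also hit comments and string literals, so the results would still need a human look.

```d
import std.regex : matchFirst, regex;
import std.string : lineSplitter;

/// Returns the 1-based numbers of lines that mention `.dup` or `.idup`.
size_t[] dupLines(string source)
{
    auto pattern = regex(`\.i?dup\b`);
    size_t[] hits;
    size_t lineNo = 0;
    foreach (line; source.lineSplitter)
    {
        ++lineNo;
        if (!line.matchFirst(pattern).empty)
            hits ~= lineNo;
    }
    return hits;
}
```

The `\b` keeps the pattern from matching longer identifiers that merely begin with `dup`.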