Discussion:
RFC: moving forward with @nogc Phobos
Andrei Alexandrescu via Digitalmars-d
2014-09-29 10:49:52 UTC
Back when I first introduced RCString I hinted that we have a larger
strategy in mind. Here it is.

The basic tenet of the approach is to reckon and act on the fact that
memory allocation (the subject of allocators) is an entirely distinct
topic from memory management, and more generally resource management.
This clarifies that it would be wrong to approach alternatives to GC in
Phobos by means of allocators. GC is not only an approach to memory
allocation, but also an approach to memory management. Reducing it to
either one is a mistake. In hindsight this looks rather obvious but it
has caused me and many people better than myself a lot of headache.

That said, allocators are nice to have and use, and I will definitely
follow up with std.allocator. However, std.allocator is not the key to a
@nogc Phobos.

Nor are ranges. There is an attitude that either output ranges, or input
ranges in conjunction with lazy computation, would solve the issue of
creating garbage.
https://github.com/D-Programming-Language/phobos/pull/2423 is a good
illustration of the latter approach: a range would be lazily created by
chaining stuff together. A range-based approach would take us further
than the allocators, but I see the following issues with it:

(a) the whole approach doesn't stand scrutiny for non-linear outputs,
e.g. outputting some sort of associative array or really any composite
type quickly becomes tenuous either with an output range (eager) or with
exposing an input range (lazy);

(b) makes the style of programming without GC radically different, and
much more cumbersome, than programming with GC; as a consequence,
programmers who consider changing one approach to another, or
implementing an algorithm neutral to it, are looking at a major rewrite;

(c) would make D/@nogc a poor cousin of C++. This is quite out of
character; technically, I have long gotten used to seeing most elaborate
C++ code like poor emulation of simple D idioms. But C++ has spent years
and decades taking to perfection an approach without a tracing garbage
collector. A departure from that would need to be superior, and that
doesn't seem to be the case with range-based approaches.

===========

Now that we clarified that these existing attempts are not going to work
well, the question remains what does. For Phobos I'm thinking of
defining and using three policies:

enum MemoryManagementPolicy { gc, rc, mrc }
immutable
    gc = MemoryManagementPolicy.gc,
    rc = MemoryManagementPolicy.rc,
    mrc = MemoryManagementPolicy.mrc;

The three policies are:

(a) gc is the classic garbage-collected style of management;

(b) rc is a reference-counted style still backed by the GC, i.e. the GC
will still be able to pick up cycles and other kinds of leaks.

(c) mrc is a reference-counted style backed by malloc.

(It should be possible to collapse rc and mrc together and make the
distinction dynamically, at runtime. I'm distinguishing them statically
here for expository purposes.)

The policy is a template parameter to functions in Phobos (and
elsewhere), and informs the functions e.g. what types to return. Consider:

auto setExtension(MemoryManagementPolicy mmp = gc, R1, R2)(R1 path, R2 ext)
if (...)
{
    static if (mmp == gc) alias S = string;
    else alias S = RCString;
    S result;
    ...
    return result;
}

On the caller side:

auto p1 = setExtension("hello", ".txt"); // fine, use gc
auto p2 = setExtension!gc("hello", ".txt"); // same
auto p3 = setExtension!rc("hello", ".txt"); // fine, use rc

So by default it's going to continue being business as usual, but
certain functions will allow passing in a (defaulted) policy for memory
management.
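
A minimal, compilable sketch of the policy-parameter idea follows. It rests on several assumptions: RCStr is a hypothetical malloc-backed stand-in for RCString (whose real interface is not shown in this thread), setExtensionSketch is a made-up name chosen to avoid clashing with std.path.setExtension, and the extension stripping is deliberately naive. It only illustrates how one function body can hand back either a GC string or a reference-counted value depending on a defaulted template argument.

```d
import core.stdc.stdlib : malloc, free;

enum MemoryManagementPolicy { gc, rc, mrc }
enum gc  = MemoryManagementPolicy.gc;
enum rc  = MemoryManagementPolicy.rc;
enum mrc = MemoryManagementPolicy.mrc;

// Hypothetical stand-in for RCString: one malloc'd block holding a
// reference count followed by the characters.
struct RCStr
{
    private size_t* count;
    private size_t len;

    this(const(char)[] a, const(char)[] b) @nogc nothrow
    {
        len = a.length + b.length;
        count = cast(size_t*) malloc(size_t.sizeof + len);
        assert(count !is null, "out of memory");
        *count = 1;
        char* p = cast(char*)(count + 1);
        p[0 .. a.length] = a[];
        p[a.length .. len] = b[];
    }

    this(this) @nogc nothrow { if (count) ++*count; }   // copy: +1
    ~this() @nogc nothrow { if (count && --*count == 0) free(count); }

    const(char)[] toSlice() const @nogc nothrow
    {
        return count ? (cast(const(char)*)(count + 1))[0 .. len] : null;
    }
}

// The policy defaults to gc, so existing callers are unaffected.
auto setExtensionSketch(MemoryManagementPolicy mmp = gc)
                       (const(char)[] path, const(char)[] ext)
{
    // Naive: cut at the last '.' anywhere in the path.
    size_t cut = path.length;
    foreach_reverse (i, c; path)
        if (c == '.') { cut = i; break; }

    static if (mmp == gc)
        return (path[0 .. cut] ~ ext).idup;   // GC-allocated string
    else
        return RCStr(path[0 .. cut], ext);    // rc and mrc share one type here
}

void main()
{
    auto p1 = setExtensionSketch("hello.d", ".txt");       // default: gc
    auto p3 = setExtensionSketch!mrc("hello.d", ".txt");   // reference counted
    static assert(is(typeof(p1) == string));
    static assert(is(typeof(p3) == RCStr));
    assert(p1 == "hello.txt");
    assert(p3.toSlice == "hello.txt");
}
```

Only the return statement differs between the two branches; everything before it is shared, which is the property the proposal relies on.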

Destroy!


Andrei
Daniel Kozak via Digitalmars-d
2014-09-29 11:03:44 UTC
On Mon, 29 Sep 2014 03:49:52 -0700
Andrei Alexandrescu via Digitalmars-d <digitalmars-d at puremagic.com>
Post by Andrei Alexandrescu via Digitalmars-d
[...]
I would add something like this:

@DefaultMemoryManagementPolicy(rc)
module A;

void main() {
    auto p1 = setExtension("hello", ".txt"); // use rc
}
Andrei Alexandrescu via Digitalmars-d
2014-09-29 11:25:08 UTC
Post by Daniel Kozak via Digitalmars-d
@DefaultMemoryManagementPolicy(rc)
module A;
void main() {
auto p1 = setExtension("hello", ".txt"); // use rc
}
(please don't overquote!)

Yah, I realized I forgot to mention this: if we play our cards right, a
lot of code will build in both approaches to memory management by just
flipping a switch. In particular, the switch can be defaulted to
something else.

I was thinking of leaving it to the user:

module A;
immutable myMMP = rc;

void main() {
    auto p1 = setExtension!myMMP("hello", ".txt");
}


Andrei
Daniel N via Digitalmars-d
2014-09-29 11:35:22 UTC
On Monday, 29 September 2014 at 10:49:53 UTC, Andrei Alexandrescu
Post by Andrei Alexandrescu via Digitalmars-d
Back when I've first introduced RCString I hinted that we have
a larger strategy in mind. Here it is.
The policy is a template parameter to functions in Phobos (and
elsewhere), and informs the functions e.g. what types to
auto setExtension(MemoryManagementPolicy mmp = gc, R1, R2)(R1
path, R2 ext)
if (...)
{
static if (mmp == gc) alias S = string;
else alias S = RCString;
S result;
...
return result;
}
How about having something like ResourceManagementPolicy.infer,
which under the hood could work something like below... you could
combine it with your original suggestion, with an overridable
MemoryManagementPolicy(just removed it to make the example
shorter)

auto setExtension(R1, R2)(R1 path, R2 ext)
if (...)
{
    static if (functionAttributes!(__traits(parent, setExtension))
               & FunctionAttribute.nogc)
        alias S = RCString;
    else
        alias S = string;
    ...
    return result;
}

Daniel N
eles via Digitalmars-d
2014-09-29 11:36:59 UTC
On Monday, 29 September 2014 at 10:49:53 UTC, Andrei Alexandrescu
entirely distinct topic
Finally!
eles via Digitalmars-d
2014-09-29 12:03:56 UTC
Post by Daniel N via Digitalmars-d
On Monday, 29 September 2014 at 10:49:53 UTC, Andrei
entirely distinct topic
Finally!
Sorry, enthusiasm. I really think this is the key for doing the
management of all resources in the right way. For me, the memory
should be seen as a resource that simply happens to have the
possibility of being manageable in a more flexible way and with
specific constraints.

For example, with respect to other kinds of resources, you could
use a lazy approach to deallocate memory, as unlike many other
resources memory is like money: it is fungible [1]. Other resources
are not. OTOH, memory comes with some quirks of its own, such
as cycles (these are, in theory, possible for other kinds
of resources, but are exceptions).

Memory management is not necessarily deterministic either. Other
resources might require determinism, however.

[1] http://en.wikipedia.org/wiki/Fungibility
Vladimir Panteleev via Digitalmars-d
2014-09-29 12:06:09 UTC
On Monday, 29 September 2014 at 10:49:53 UTC, Andrei Alexandrescu
Post by Andrei Alexandrescu via Digitalmars-d
auto setExtension(MemoryManagementPolicy mmp = gc, R1, R2)(R1
path, R2 ext)
if (...)
{
static if (mmp == gc) alias S = string;
else alias S = RCString;
S result;
...
return result;
}
Is this practically feasible without blowing up Phobos several
times in size and complexity?

And I'm not sure adding a template parameter to every function is
going to work well, what with all the existing template
parameters - especially the optional ones.
Andrei Alexandrescu via Digitalmars-d
2014-09-29 15:17:37 UTC
Post by Andrei Alexandrescu via Digitalmars-d
auto setExtension(MemoryManagementPolicy mmp = gc, R1, R2)(R1 path, R2 ext)
if (...)
{
static if (mmp == gc) alias S = string;
else alias S = RCString;
S result;
...
return result;
}
Is this practically feasible without blowing up Phobos several times in
size and complexity?
I believe so. For the most part implementations will be identical - just
look at the RCString primitives, which are virtually the same as string's.
And I'm not sure adding a template parameter to every function is going
to work well, what with all the existing template parameters -
especially the optional ones.
Not all functions, just those that allocate. I agree there will be a few
decisions to be made there.


Andrei
Dicebot via Digitalmars-d
2014-09-29 12:29:32 UTC
Any assumption that library code can get away with some set of
pre-defined allocation strategies is crap. This whole discussion
was about how important it is to move allocation decisions to
user code (ranges are just one tool to achieve that, Don has been
presenting examples of how we do that with plain arrays in DConf
2014 talk).

In that regard allocators + ranges are still the way to go in my
opinion. Yes, sometimes those result in very hard to use API -
providing GC-heavy but friendly alternatives for those shouldn't
do any harm. But in general full decoupling of algorithms from
allocations is necessary. If that makes D a poor cousin of C++ we
may have to learn a few tricks from C++.
Andrei Alexandrescu via Digitalmars-d
2014-09-29 15:17:31 UTC
Post by Dicebot via Digitalmars-d
Any assumption that library code can go away with some set of
pre-defined allocation strategies is crap. This whole discussion was
about how important it is to move allocation decisions to user code
(ranges are just one tool to achieve that, Don has been presenting
examples of how we do that with plain arrays in DConf 2014 talk).
That's making exactly the confusion I was - that memory allocation
strategy is the same as memory management strategy.
Post by Dicebot via Digitalmars-d
In that regard allocators + ranges are still the way to go in my
opinion. Yes, sometimes those result in very hard to use API - providing
GC-heavy but friendly alternatives for those shouldn't do any harm. But
in general full decoupling of algorithms from allocations is necessary.
If that makes D poor cousin of C++ we may have a learn few tricks from C++.
As long as things are trivial they can be done with relative ease,
albeit with more pain. But consider e.g. the recent JSON library by
Sönke. It needs to create a lookup data structure and return things like
strings from it. What primitives do you think could it define?


Andrei
Dicebot via Digitalmars-d
2014-09-29 15:53:03 UTC
On Monday, 29 September 2014 at 15:18:40 UTC, Andrei Alexandrescu
Post by Andrei Alexandrescu via Digitalmars-d
Post by Dicebot via Digitalmars-d
Any assumption that library code can go away with some set of
pre-defined allocation strategies is crap. This whole
discussion was
about how important it is to move allocation decisions to user code
(ranges are just one tool to achieve that, Don has been
presenting
examples of how we do that with plain arrays in DConf 2014
talk).
That's making exactly the confusion I was - that memory
allocation strategy is the same as memory management strategy.
Yes but neither decision belongs to library code except for very
rare cases.
Post by Andrei Alexandrescu via Digitalmars-d
Post by Dicebot via Digitalmars-d
In that regard allocators + ranges are still the way to go in
my
opinion. Yes, sometimes those result in very hard to use API - providing
GC-heavy but friendly alternatives for those shouldn't do any
harm. But
in general full decoupling of algorithms from allocations is
necessary.
If that makes D poor cousin of C++ we may have a learn few
tricks from C++.
As long as things are trivial they can be done with relative
ease, albeit with more pain. But consider e.g. the recent JSON
library by Sönke. It needs to create a lookup data structure
and return things like strings from it. What primitives do you
think could it define?
Sounds like it may have to define its own kind of allocator with
certain implementation restrictions (and implement it in terms of
GC by default). I have not actually read the code for that
proposal, so it is hard to guess. I will need to do that if it really
matters.
Andrei Alexandrescu via Digitalmars-d
2014-09-29 17:04:53 UTC
Post by Andrei Alexandrescu via Digitalmars-d
Post by Dicebot via Digitalmars-d
Any assumption that library code can go away with some set of
pre-defined allocation strategies is crap. This whole discussion was
about how important it is to move allocation decisions to user code
(ranges are just one tool to achieve that, Don has been presenting
examples of how we do that with plain arrays in DConf 2014 talk).
That's making exactly the confusion I was - that memory allocation
strategy is the same as memory management strategy.
Yes but neither decision belongs to library code except for very rare
cases.
You just assert it, so all I can say is "I understand you believe this".
I've motivated my argument. You may want to do the same for yours.
Post by Andrei Alexandrescu via Digitalmars-d
Post by Dicebot via Digitalmars-d
In that regard allocators + ranges are still the way to go in my
opinion. Yes, sometimes those result in very hard to use API - providing
GC-heavy but friendly alternatives for those shouldn't do any harm. But
in general full decoupling of algorithms from allocations is necessary.
If that makes D poor cousin of C++ we may have a learn few tricks from C++.
As long as things are trivial they can be done with relative ease,
albeit with more pain. But consider e.g. the recent JSON library by
Sönke. It needs to create a lookup data structure and return things
like strings from it. What primitives do you think could it define?
Sounds like it may have to define own kind of allocator with certain
implementation restrictions (and implement it in terms of GC by
default). I have not actually read the code for that proposal so hard to
guess. Will need to do it if it really matters.
So you don't have an answer. And again you are confusing memory
allocation with memory management.

I have sketched an approach that works and will take us to Phobos being
most transparently usable with tracing collection or with reference
counting. Part of that is RCString (and generally reference counted
slices and hashtables), and another part is the @refcounted attribute
for classes. I will push it through. If you have any objections, it
would be great if you argued them properly.


Thanks,

Andrei
Dicebot via Digitalmars-d
2014-09-29 17:19:23 UTC
On Monday, 29 September 2014 at 17:04:54 UTC, Andrei Alexandrescu
Post by Andrei Alexandrescu via Digitalmars-d
Post by Dicebot via Digitalmars-d
Yes but neither decision belongs to library code except for
very rare
cases.
You just assert it, so all I can say is "I understand you
believe this". I've motivated my argument. You may want to do
the same for yours.
I probably have missed the part with arguments :) Your reasoning
is not fundamentally different from "GC should be enough" but
extended to several options from single one.

My argument is simple - one can't foresee everything. I remember
reading a book by one guy who has been advocating a thing called
"policy-based design", you may know him ;) I was quite impressed
with the simple but practical basic idea - decoupling parts of
the implementation that are not inherently related.
Post by Andrei Alexandrescu via Digitalmars-d
So you don't have an answer. And again you are confusing memory
allocation with memory management.
Yes, sorry, I don't have an answer. Or time to dive deeply into
the code unless it is really important or my direct
responsibility.

Unfortunately, I don't see how your proposal fits our
code either. Most of Sociomantic's code relies on using arrays as
ref arguments to avoid creating new GC roots (no, we don't
need/want to switch to ARC). This has several times been cited as the
reason why Phobos in its current shape is largely unusable for
our needs even when the D2 switch is finished. I don't see how the
proposal in the original post changes that.
Andrei Alexandrescu via Digitalmars-d
2014-09-29 22:18:39 UTC
Post by Dicebot via Digitalmars-d
Post by Andrei Alexandrescu via Digitalmars-d
Yes but neither decision belongs to library code except for very rare
cases.
You just assert it, so all I can say is "I understand you believe
this". I've motivated my argument. You may want to do the same for yours.
I probably have missed the part with arguments :)
The basic tenet of the approach is to reckon and act on the fact that memory allocation (the subject of allocators) is an entirely distinct topic from memory management, and more generally resource management. This clarifies that it would be wrong to approach alternatives to GC in Phobos by means of allocators. GC is not only an approach to memory allocation, but also an approach to memory management. Reducing it to either one is a mistake. In hindsight this looks rather obvious but it has caused me and many people better than myself a lot of headache.
(a) the whole approach doesn't stand scrutiny for non-linear outputs, e.g. outputting some sort of associative array or really any composite type quickly becomes tenuous either with an output range (eager) or with exposing an input range (lazy);
(b) makes the style of programming without GC radically different, and much more cumbersome, than programming with GC; as a consequence, programmers who consider changing one approach to another, or implementing an algorithm neutral to it, are looking at a major rewrite;
=================
Post by Dicebot via Digitalmars-d
Your reasoning is not
fundamentally different from "GC should be enough" but extended to
several options from single one.
Where's RC in the "GC should be enough"?
Post by Dicebot via Digitalmars-d
My argument is simple - one can't forsee everything. I remember reading
book of one guy who has been advocating thing called "policy-based
design", you may know him ;) Was quite impressed with the simple but
practical basic idea - decoupling parts of the implementation that are
not inherently related.
Totally. Then it would be great if you trusted the guy when he makes a
judgment call in which reasonable people may disagree.

There are many memory /allocation/ policies but precious few memory
/management/ policies. I only know "manual", "scoped", "reference
counted", and "tracing" based on... the last 50 years of software
development.
Post by Dicebot via Digitalmars-d
Post by Andrei Alexandrescu via Digitalmars-d
So you don't have an answer. And again you are confusing memory
allocation with memory management.
Yes, sorry, I don't have an answer. Or time do deeply dive into the code
unless it is really important or my direct responsibility.
Unfortunately, I don't see an answer how your proposal fits our code
either. Most of Sociomantic code relies on using arrays as ref arguments
to avoid creating of new GC roots (no, we don't need/want to switch to
ARC). This was several times called as the reason why Phobos in its
current shape is largely unusable for out needs even when D2 switch is
finished. I don't see how proposal in original post changes that.
Passing arrays by reference is plenty adequate with all memory
management strategies. You'll need to wait and see how the proposal
changes that, but if you naysay, back it up.


Andrei
Dicebot via Digitalmars-d
2014-09-29 22:43:03 UTC
On Monday, 29 September 2014 at 22:18:38 UTC, Andrei Alexandrescu
Post by Andrei Alexandrescu via Digitalmars-d
Passing arrays by reference is plenty adequate with all memory
management strategies. You'll need to wait and see how the
proposal changes that, but if you naysay, back it up.
Resisting the urge to argue the other points, this
pretty much says that the focus on things that are important for me
is abandoned in favor of something that mostly doesn't matter. Am
I supposed to be happy? :) Am I supposed to be twice as happy
when you propose to close pull requests that do help because of
this proposal?

I am waiting for what comes next but right now "not impressed" is
the most optimistic way to put this. Sorry :(
Andrei Alexandrescu via Digitalmars-d
2014-09-29 23:40:44 UTC
Post by Andrei Alexandrescu via Digitalmars-d
Passing arrays by reference is plenty adequate with all memory
management strategies. You'll need to wait and see how the proposal
changes that, but if you naysay, back it up.
Resisting to go on meaningless argument on other points, this pretty
much says that focus on things that are important for me is abandoned in
favor of something that mostly doesn't matter. Am I supposed to be
happy? :) Am I supposed to be twice as happy when you propose to close
pull requests that do help because of this proposal?
I am waiting for what comes next but right now "not impressed" is most
optimistic way to put this. Sorry :(
I trust you'll be. -- Andrei
Chris Williams via Digitalmars-d
2014-09-29 18:14:16 UTC
Post by Dicebot via Digitalmars-d
Any assumption that library code can go away with some set of
pre-defined allocation strategies is crap. This whole
discussion was about how important it is to move allocation
decisions to user code (ranges are just one tool to achieve
that, Don has been presenting examples of how we do that with
plain arrays in DConf 2014 talk).
I think the key to this sort of issue is to try and get as much
functionality in Phobos marked @nogc as possible. After that,
building new library-like functionality into a DUB package that
assumes @nogc and only uses the @nogc code in Phobos would be the
next step. Should that get to a state where it's popular and
supported, pulling it in as std.nogc.* might make sense, but
trying to redo Phobos as a manual memory collection library is
infeasible.

Were I your company, I'd start working on leading such an effort.

Unlike Tango, I don't think a development like this would split
the community nor the community's resources in a useless fashion.
Paulo Pinto via Digitalmars-d
2014-09-29 17:16:00 UTC
Post by Andrei Alexandrescu via Digitalmars-d
[...]
(a) gc is the classic garbage-collected style of management;
(b) rc is a reference-counted style still backed by the GC, i.e. the GC
will still be able to pick up cycles and other kinds of leaks.
(c) mrc is a reference-counted style backed by malloc.
(It should be possible to collapse rc and mrc together and make the
distinction dynamically, at runtime. I'm distinguishing them statically
here for expository purposes.)
...
Personally, I would go just for (b) with compiler support for
increment/decrement removal, as I think it will be too complex having to
support everything and this will complicate all libraries.

Anyway, that was just my 0.02€. Stepping out of the thread as I just toy
around with D and cannot add much more to the discussion.

--
Paulo
Andrei Alexandrescu via Digitalmars-d
2014-09-29 22:04:03 UTC
Post by Paulo Pinto via Digitalmars-d
Personally, I would go just for (b) with compiler support for
increment/decrement removal, as I think it will be too complex having to
support everything and this will complicate all libraries.
Compiler already knows (after inlining) that ++i and --i cancel each
other, so we should be in good shape there. -- Andrei
Marco Leise via Digitalmars-d
2014-09-30 11:24:56 UTC
Am Mon, 29 Sep 2014 15:04:03 -0700
Post by Andrei Alexandrescu via Digitalmars-d
Post by Paulo Pinto via Digitalmars-d
Personally, I would go just for (b) with compiler support for
increment/decrement removal, as I think it will be too complex having to
support everything and this will complicate all libraries.
Compiler already knows (after inlining) that ++i and --i cancel each
other, so we should be in good shape there. -- Andrei
That helps with very small, inlined functions until Marc
Schütz's work on borrowed pointers makes it redundant by
unifying scoped copies of GC, RC and stack pointers.
In any case inc/dec elision is an optimization and not an
enabling feature. It sure is on the radar and can be improved
later on.
--
Marco
Manu via Digitalmars-d
2014-10-01 01:26:06 UTC
On 30 September 2014 08:04, Andrei Alexandrescu via Digitalmars-d
Post by Paulo Pinto via Digitalmars-d
Personally, I would go just for (b) with compiler support for
increment/decrement removal, as I think it will be too complex having to
support everything and this will complicate all libraries.
Compiler already knows (after inlining) that ++i and --i cancel each other,
so we should be in good shape there. -- Andrei
The compiler doesn't know that MyLibrary_AddRef(Thing *t); and
MyLibrary_DecRef(Thing *t); cancel each other out though...
rc needs primitives that the compiler understands implicitly, so that
rc logic can be more complex than ++i/--i;
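
To make the increment/decrement pairing concrete, here is a minimal sketch; Handle, peek, and the incs/decs counters are hypothetical names used only for illustration. Passing a reference-counted handle by value runs the postblit on entry and the destructor on exit, so every such call contributes one matched pair. Whether a compiler can prove the pair redundant and drop it is the open question in this exchange; the sketch only shows where the pair comes from.

```d
// Hypothetical reference-counted handle; the count lives outside the struct.
struct Handle
{
    int* count;               // points at the shared reference count
    static int incs, decs;    // bookkeeping for this demo only

    this(this) { if (count) { ++*count; ++incs; } }   // copy: "AddRef"
    ~this()    { if (count) { --*count; ++decs; } }   // destroy: "DecRef"
}

// Taking the handle by value forces a copy for the duration of the call.
int peek(Handle h) { return *h.count; }

void main()
{
    int c = 1;                       // one owner to start with
    {
        auto h = Handle(&c);         // initialised from an rvalue: no postblit
        assert(peek(h) == 2);        // inside the call the copy is alive
        assert(c == 1);              // after the call the pair has cancelled
        assert(Handle.incs == 1 && Handle.decs == 1);
    }
    assert(c == 0);                  // h itself has been destroyed
}
```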
deadalnix via Digitalmars-d
2014-10-01 01:41:56 UTC
On Wednesday, 1 October 2014 at 01:26:45 UTC, Manu via
Post by Manu via Digitalmars-d
On 30 September 2014 08:04, Andrei Alexandrescu via
Digitalmars-d
Post by Andrei Alexandrescu via Digitalmars-d
Post by Paulo Pinto via Digitalmars-d
Personally, I would go just for (b) with compiler support for
increment/decrement removal, as I think it will be too
complex having to
support everything and this will complicate all libraries.
Compiler already knows (after inlining) that ++i and --i
cancel each other,
so we should be in good shape there. -- Andrei
The compiler doesn't know that MyLibrary_AddRef(Thing *t); and
MyLibrary_DecRef(Thing *t); cancel eachother out though...
rc needs primitives that the compiler understands implicitly,
so that
rc logic can be more complex than ++i/--i;
Even with simply i++ and i--, the information that they always come
in pairs is lost on the compiler in many cases.
Jacob Carlborg via Digitalmars-d
2014-09-29 17:25:54 UTC
Post by Andrei Alexandrescu via Digitalmars-d
Now that we clarified that these existing attempts are not going to work
well, the question remains what does. For Phobos I'm thinking of
enum MemoryManagementPolicy { gc, rc, mrc }
immutable
gc = MemoryManagementPolicy.gc,
rc = MemoryManagementPolicy.rc,
mrc = MemoryManagementPolicy.mrc;
(a) gc is the classic garbage-collected style of management;
(b) rc is a reference-counted style still backed by the GC, i.e. the GC
will still be able to pick up cycles and other kinds of leaks.
(c) mrc is a reference-counted style backed by malloc.
(It should be possible to collapse rc and mrc together and make the
distinction dynamically, at runtime. I'm distinguishing them statically
here for expository purposes.)
The policy is a template parameter to functions in Phobos (and
auto setExtension(MemoryManagementPolicy mmp = gc, R1, R2)(R1 path, R2 ext)
if (...)
{
static if (mmp == gc) alias S = string;
else alias S = RCString;
S result;
...
return result;
}
auto p1 = setExtension("hello", ".txt"); // fine, use gc
auto p2 = setExtension!gc("hello", ".txt"); // same
auto p3 = setExtension!rc("hello", ".txt"); // fine, use rc
How do allocators fit into this? Will it be an additional argument to
the function, or a separate stack that one can push and pop allocators to?
--
/Jacob Carlborg
Andrei Alexandrescu via Digitalmars-d
2014-09-29 22:11:26 UTC
Post by Jacob Carlborg via Digitalmars-d
How does allocators fit in this? Will it be an additional argument to
the function. Or a separate stack that one can push and pop allocators to?
There would be one allocator per thread (changeable) deferring to a
global interlocked allocator. Most algorithms would just use whatever
allocator is installed.

I know the notion of a thread-local and then global allocator is liable
to cause some people an apoplexy attack. But it's time to model things as they
are - memory is a global resource and it ought to be treated as such. No
need to pass allocators around except for special cases.
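
For a sense of what "use whatever allocator is installed" looks like in code, the sketch below uses the std.experimental.allocator package that later shipped with Phobos. It is not necessarily the design meant here, but its documentation describes the same shape: theAllocator is a replaceable per-thread allocator that by default ends up fetching memory from the process-wide processAllocator. The function name fillSevens is hypothetical.

```d
import std.experimental.allocator : theAllocator, makeArray, dispose;

// The algorithm never receives an allocator argument; it simply asks
// whichever allocator is currently installed for this thread.
int[] fillSevens(size_t n)
{
    int[] buf = theAllocator.makeArray!int(n);
    buf[] = 7;
    return buf;
}

void main()
{
    int[] a = fillSevens(4);
    scope (exit) theAllocator.dispose(a);   // return memory to the same allocator
    assert(a.length == 4);
    assert(a[0] == 7 && a[3] == 7);
}
```

Swapping the strategy for a thread then amounts to assigning a different allocator to theAllocator before calling the algorithm, rather than threading a parameter through every call.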


Andrei
Johannes Pfau via Digitalmars-d
2014-09-30 08:34:30 UTC
Am Mon, 29 Sep 2014 15:11:26 -0700
Post by Andrei Alexandrescu via Digitalmars-d
Post by Jacob Carlborg via Digitalmars-d
How does allocators fit in this? Will it be an additional argument
to the function. Or a separate stack that one can push and pop
allocators to?
There would be one allocator per thread (changeable) deferring to a
global interlocked allocator. Most algorithms would just use whatever
allocator is installed.
I know the notion of a thread-local and then global allocator is
liable to cause some an apoplexy attack. But it's time to model
things as they are - memory is a global resource and it ought to be
treated as such. No need to pass allocators around except for special
cases.
Andrei
No need to pass allocators around except for special
cases.
So you propose RC + global/thread local allocators as the solution for
all memory related problems as 'memory management is not allocation'.
And you claim that using output ranges / providing buffers / allocators
is not an option because it only works in some special cases?

What if I don't want automated memory _management_? What if I want a
function to use a stack buffer? Or if I want to free manually?

If I want std.string.toStringz to put the result into a temporary stack
buffer your solution doesn't help at all. Passing an output range,
allocator or buffer would all solve this.
Peter Alexander via Digitalmars-d
2014-09-30 10:41:14 UTC
Permalink
On Tuesday, 30 September 2014 at 08:34:26 UTC, Johannes Pfau
Post by Johannes Pfau via Digitalmars-d
What if I don't want automated memory _management_? What if I
want a
function to use a stack buffer? Or if I want to free manually?
Agreed. This is the common case we need to solve for, but this is
memory allocation, not management. I'm not sure where manual
management fits into Andrei's scheme. Andrei, could you give an
example of, e.g. how toStringz would work with a stack buffer in
your proposed scheme?

Another thought: if we use a template parameter, what's the story
for virtual functions (e.g. Object.toString)? They can't be
templated.
Andrei Alexandrescu via Digitalmars-d
2014-09-30 12:29:55 UTC
Permalink
Post by Johannes Pfau via Digitalmars-d
What if I don't want automated memory _management_? What if I want a
function to use a stack buffer? Or if I want to free manually?
Agreed. This is the common case we need to solve for, but this is memory
allocation, not management. I'm not sure where manual management fits
into Andrei's scheme. Andrei, could you give an example of, e.g. how
toStringz would work with a stack buffer in your proposed scheme?
There would be no possibility to do that. I mean it's not there but it
can be added e.g. as a "manual" option of performing memory management.
The "manual" overloads for functions would require an output range
parameter. Not all functions might support a "manual" option - that'd be
rejected statically.
Another thought: if we use a template parameter, what's the story for
virtual functions (e.g. Object.toString)? They can't be templated.
Good point. We need to think about that.


Andrei
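A rough sketch of what such a "manual" overload could look like, under the assumption that it takes an output range and is constrained with `isOutputRange`. The names `toStringzInto` and `FixedBuffer` are hypothetical and not part of Phobos. The template constraint is what makes an unsuitable destination fail at compile time.

```d
import std.range.primitives : isOutputRange, put;

// Hypothetical caller-owned sink: a fixed-size buffer, typically on the stack.
struct FixedBuffer(size_t N)
{
    private char[N] data;
    private size_t used;

    void put(char c)
    {
        assert(used < N, "FixedBuffer overflow");
        data[used++] = c;
    }

    // The part of the buffer written so far.
    const(char)[] written() const { return data[0 .. used]; }
}

// Hypothetical "manual" overload: no allocation, the caller picks the storage.
// The template constraint rejects unsuitable sinks at compile time.
void toStringzInto(Sink)(const(char)[] s, ref Sink sink)
if (isOutputRange!(Sink, char))
{
    foreach (char c; s)
        put(sink, c);
    put(sink, '\0'); // the terminating zero that toStringz is about
}

unittest
{
    FixedBuffer!32 buf; // lives in this stack frame
    toStringzInto("hello", buf);
    assert(buf.written == "hello\0");

    int notASink;
    static assert(!__traits(compiles, toStringzInto("hello", notASink)));
}
```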
Jacob Carlborg via Digitalmars-d
2014-09-30 13:18:17 UTC
Permalink
Post by Andrei Alexandrescu via Digitalmars-d
Good point. We need to think about that.
Weren't all methods in Object supposed to be lifted out from Object anyway?
--
/Jacob Carlborg
Johannes Pfau via Digitalmars-d
2014-09-30 16:54:54 UTC
Permalink
On Tue, 30 Sep 2014 05:29:55 -0700
Post by Andrei Alexandrescu via Digitalmars-d
Post by Peter Alexander via Digitalmars-d
Another thought: if we use a template parameter, what's the story
for virtual functions (e.g. Object.toString)? They can't be
templated.
Good point. We need to think about that.
Passing buffers or sink delegates (like we already do for toString) is
possible for some functions. For toString it works fine. Then implement
to!RCString(object) using the toString(sink delegate) overload.

For all other functions RC is indeed difficult, probably only possible
with different manually written overloads (and a dummy parameter as we
can't overload on return type)?
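To illustrate the sink-delegate idea, here is a small sketch in which a type exposes a sink-based `toString` and a helper collects the output into caller-owned storage. `Tag` and `renderInto` are made-up names; a conversion to a reference-counted string could presumably be layered on the same sink overload, but that part is not shown.

```d
// A type with a sink-based toString: it emits chunks instead of
// building and returning a GC-allocated string.
struct Tag
{
    string name;

    void toString(scope void delegate(const(char)[]) sink) const
    {
        sink("<");
        sink(name);
        sink(">");
    }
}

// Hypothetical helper: renders any value with a sink-based toString into
// a caller-supplied buffer and returns the written slice. Output that does
// not fit is truncated.
char[] renderInto(T)(const(T) value, char[] buf)
{
    size_t used;
    value.toString((const(char)[] chunk) {
        immutable room = buf.length - used;
        immutable n = chunk.length < room ? chunk.length : room;
        buf[used .. used + n] = chunk[0 .. n];
        used += n;
    });
    return buf[0 .. used];
}

unittest
{
    char[32] storage; // stack buffer, no dynamic allocation
    assert(renderInto(Tag("b"), storage[]) == "<b>");
}
```

Because the delegate parameter is `scope`, the literal passed to `toString` should not need a heap-allocated closure.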
Vladimir Panteleev via Digitalmars-d
2014-09-30 10:47:54 UTC
Permalink
On Tuesday, 30 September 2014 at 08:34:26 UTC, Johannes Pfau
Post by Johannes Pfau via Digitalmars-d
What if I don't want automated memory _management_? What if I
want a
function to use a stack buffer? Or if I want to free manually?
If I want std.string.toStringz to put the result into a
temporary stack
buffer your solution doesn't help at all. Passing an output
range,
allocator or buffer would all solve this.
I don't understand, why wouldn't you be able to temporarily set
the thread-local allocator to use the stack buffer, and restore
it once done?
Johannes Pfau via Digitalmars-d
2014-09-30 12:02:13 UTC
Permalink
On Tue, 30 Sep 2014 10:47:54 +0000
Post by Peter Alexander via Digitalmars-d
On Tuesday, 30 September 2014 at 08:34:26 UTC, Johannes Pfau
Post by Johannes Pfau via Digitalmars-d
What if I don't want automated memory _management_? What if I want a
function to use a stack buffer? Or if I want to free manually?
If I want std.string.toStringz to put the result into a
temporary stack
buffer your solution doesn't help at all. Passing an output
range,
allocator or buffer would all solve this.
I don't understand, why wouldn't you be able to temporarily set
the thread-local allocator to use the stack buffer, and restore
it once done?
That's possible but insanely dangerous in case you forget to reset the
thread allocator. Also storing stack pointers in global state (even
thread-local) is dangerous, for example interaction with fibers could
lead to bugs, etc. (What if I set the allocator to a stack allocator
and call a function which yields from a Fiber?).

You also lose all possibilities to use 'scope' or a similar mechanism
to prevent escaping a stack pointer.

Also a stack buffer is not a complete allocator, but in some
cases like toStringz it works even better than allocators (less
overhead as you know the required buffer size before calling toStringz
and there's only one allocation)

And it is a hack. Of course you can provide a wrapper which does
auto oldAlloc = threadLocalAllocator;
threadLocalAllocator = stackbuf;
// register the restore before the call so it also runs if func() throws
scope(exit)
threadLocalAllocator = oldAlloc;
func();

But how could anybody think this is good API design? I think I'd rather
fork the required Phobos functions instead of using such a wrapper.
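For reference, the save-and-restore wrapper can be packaged as a scope guard so that the reset cannot be forgotten. This is only a sketch: `threadLocalAllocator` is modelled here as a plain delegate slot and `ScopedAllocator` is a hypothetical name. It does not address the other objections above (fibers, escaping stack pointers).

```d
// Hypothetical stand-in for the thread-local allocator slot.
alias AllocFn = void[] delegate(size_t);
AllocFn threadLocalAllocator; // module-level variables are thread-local in D

// Installs a replacement on construction and restores the previous
// allocator on destruction, i.e. at scope exit, even if an exception is thrown.
struct ScopedAllocator
{
    private AllocFn saved;

    @disable this();
    @disable this(this); // copying would restore twice

    this(AllocFn replacement)
    {
        saved = threadLocalAllocator;
        threadLocalAllocator = replacement;
    }

    ~this() { threadLocalAllocator = saved; }
}

unittest
{
    ubyte[64] stackbuf;
    size_t used;

    // Trivial bump allocation out of the stack buffer.
    void[] bump(size_t n)
    {
        if (used + n > stackbuf.length) return null;
        auto block = stackbuf[used .. used + n];
        used += n;
        return block;
    }

    {
        auto guard = ScopedAllocator(&bump);
        assert(threadLocalAllocator(16).length == 16);
    }
    assert(threadLocalAllocator is null); // previous (empty) slot restored
}
```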
Ola Fosheim Grøstad via Digitalmars-d
2014-09-30 12:32:06 UTC
Permalink
On Tuesday, 30 September 2014 at 12:02:10 UTC, Johannes Pfau
Post by Johannes Pfau via Digitalmars-d
That's possible but insanely dangerous in case you forget to
reset the
thread allocator. Also storing stack pointers in global state
(even
thread-local) is dangerous, for example interaction with fibers could
lead to bugs, etc. (What if I set the allocator to a stack
allocator
and call a function which yields from a Fiber?).
You also lose all possibilities to use 'scope' or a similar
mechanism
to prevent escaping a stack pointer.
Yes, I agree. One option would be to have a thread-local region
allocator that can only be used for "scoped" allocation. That is,
only for allocations that are not assigned to globals, cannot get
stuck in fibers, and are returned to the calling function.
That way the context can free the region when done and you can
get away with little allocation overhead if used prudently.

I also don't agree with the sentiment that allocation/management
can be kept fully separate. If you have a region allocator that
is refcounted it most certainly is interrelated with a fairly
tight coupling.

Also the idea exposed in this thread that release()/retain() is
purely arithmetic and can be optimized as such is quite wrong.
retain() is conceptually a locking construct on a memory region
that prevents reuse. I've made a case for TSX, but one can
probably come up with other multi-threaded examples.

These hacks are not making D more attractive to people who find
C++ lacking in elegance.

Actually, creating a phobos light with nothrow, nogc, a light
runtime and basic building blocks such as intrinsics to build
your own RC with compiler support sounds like a more interesting
option.

I am really not interested in library provided allocators or RC.
If I am not going to use malloc/GC then I want to write my own
and have dedicated allocators for the most common objects.

I think it is quite reasonable that people who want to take the
difficult road of not using GC at all also have to do some extra
work, but provide a clean slate to work from!
Paulo Pinto via Digitalmars-d
2014-09-30 12:51:23 UTC
Permalink
On Tuesday, 30 September 2014 at 12:32:08 UTC, Ola Fosheim
Post by Ola Fosheim Grøstad via Digitalmars-d
On Tuesday, 30 September 2014 at 12:02:10 UTC, Johannes Pfau
Post by Johannes Pfau via Digitalmars-d
...
Also the idea exposed in this thread that release()/retain()
is
purely arithmetic and can be optimized as such is quite wrong.
retain() is conceptually a locking construct on a memory region
that prevents reuse. I've made a case for TSX, but one can
probably come up with other multi-threaded examples.
It works when two big ifs come together.

- inside the same scope (e.g. function level)

- when the reference is not shared between threads.

While it is of limited applicability, Objective-C (and eventually
Swift) codebases prove it helps in most real life use cases.

--
Paulo
Ola Fosheim Grøstad via Digitalmars-d
2014-09-30 12:55:39 UTC
Permalink
Post by Paulo Pinto via Digitalmars-d
It works when two big ifs come together.
- inside the same scope (e.g. function level)
- when the reference is not shared between threads.
While it is of limited applicability, Objective-C (and
eventually Swift) codebases prove it helps in most real life
use cases.
But Objective-C has thread safe ref-counting?!

If it isn't thread safe it is of very limited utility, you can
usually get away with unique_ptr in single threaded scenarios.
Paulo Pinto via Digitalmars-d
2014-09-30 20:13:33 UTC
Permalink
On 30.09.2014 14:55, "Ola Fosheim Grøstad" wrote
Post by Ola Fosheim Grøstad via Digitalmars-d
Post by Paulo Pinto via Digitalmars-d
It works when two big ifs come together.
- inside the same scope (e.g. function level)
- when the reference is not shared between threads.
While it is of limited applicability, Objective-C (and eventually
Swift) codebases prove it helps in most real life use cases.
But Objective-C has thread safe ref-counting?!
If it isn't thread safe it is of very limited utility, you can usually
get away with unique_ptr in single threaded scenarios.
Did you read my second bullet?
Ola Fosheim Grostad via Digitalmars-d
2014-09-30 20:27:37 UTC
Permalink
Post by Paulo Pinto via Digitalmars-d
On 30.09.2014 14:55, "Ola Fosheim Grøstad" wrote
On Tuesday, 30 September 2014 at 12:51:25 UTC, Paulo Pinto
Post by Paulo Pinto via Digitalmars-d
It works when two big ifs come together.
- inside the same scope (e.g. function level)
- when the reference is not shared between threads.
While it is of limited applicability, Objective-C (and
eventually
Swift) codebases prove it helps in most real life use cases.
But Objective-C has thread safe ref-counting?!
If it isn't thread safe it is of very limited utility, you can usually
get away with unique_ptr in single threaded scenarios.
Did you read my second bullet?
Yes? I don't want builtin rc default for single threaded use
cases. I do want it when references are shared between threads,
e.g. for cache objects.
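To make the distinction in this exchange concrete: once a count is shared between threads, retain and release have to be atomic read-modify-write operations rather than plain increments the compiler can freely cancel out. A minimal sketch using druntime's `core.atomic` follows; `SharedRefCount` is a hypothetical name, not a proposed API.

```d
import core.atomic : atomicOp;

// Sketch of a count that is safe to retain/release from several threads.
struct SharedRefCount
{
    private shared size_t refs = 1; // the creator holds the first reference

    void retain() shared { atomicOp!"+="(refs, 1); }

    // Returns true when the caller has just dropped the last reference
    // and is therefore responsible for freeing the object.
    bool release() shared { return atomicOp!"-="(refs, 1) == 0; }
}

unittest
{
    shared SharedRefCount count;
    count.retain();           // 2 references
    assert(!count.release()); // back to 1, still alive
    assert(count.release());  // 0, last owner cleans up
}
```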
Mike via Digitalmars-d
2014-09-30 23:01:45 UTC
Permalink
On Tuesday, 30 September 2014 at 12:32:08 UTC, Ola Fosheim
...basic building blocks such as intrinsics to build your own
RC with compiler support sounds like a more interesting option.
I agree.
Andrei Alexandrescu via Digitalmars-d
2014-09-30 12:31:15 UTC
Permalink
Post by Johannes Pfau via Digitalmars-d
What if I don't want automated memory _management_? What if I want a
function to use a stack buffer? Or if I want to free manually?
If I want std.string.toStringz to put the result into a temporary stack
buffer your solution doesn't help at all. Passing an output range,
allocator or buffer would all solve this.
I don't understand, why wouldn't you be able to temporarily set the
thread-local allocator to use the stack buffer, and restore it once done?
That's doable, but you don't get to place the string at a _specific_
buffer. -- Andrei
Andrei Alexandrescu via Digitalmars-d
2014-09-30 12:23:29 UTC
Permalink
Post by Johannes Pfau via Digitalmars-d
So you propose RC + global/thread local allocators as the solution for
all memory related problems as 'memory management is not allocation'.
And you claim that using output ranges / providing buffers / allocators
is not an option because it only works in some special cases?
Correct. I assume you meant an irony/sarcasm somewhere :o).
Post by Johannes Pfau via Digitalmars-d
What if I don't want automated memory _management_? What if I want a
function to use a stack buffer? Or if I want to free manually?
If I want std.string.toStringz to put the result into a temporary stack
buffer your solution doesn't help at all. Passing an output range,
allocator or buffer would all solve this.
Correct. The output of toStringz would be either a GC string or an RC
string.


Andrei
Johannes Pfau via Digitalmars-d
2014-09-30 16:49:47 UTC
Permalink
On Tue, 30 Sep 2014 05:23:29 -0700
Post by Andrei Alexandrescu via Digitalmars-d
Post by Johannes Pfau via Digitalmars-d
So you propose RC + global/thread local allocators as the solution
for all memory related problems as 'memory management is not
allocation'. And you claim that using output ranges / providing
buffers / allocators is not an option because it only works in some
special cases?
Correct. I assume you meant an irony/sarcasm somewhere :o).
The sarcasm is supposed to be here: '_all_ memory related problems' ;-)

I guess my point is that although RC is useful in some cases output
ranges / sink delegates / pre-allocated buffers are still necessary in
other cases and RC is not the solution for _everything_.

As Manu often pointed out sometimes you do not want any dynamic
allocation (toStringz in games is a good example) and here RC doesn't
help.

Another example is format which can already write to output ranges and
uses sink delegates internally. That's a much better abstraction than
simply returning a reference counted string (allocated with a thread
local allocator). Using sink delegates internally is also more
efficient than creating temporary RCStrings. And sometimes there's no
allocation at all this way (directly writing to a socket/file).
Post by Andrei Alexandrescu via Digitalmars-d
Post by Johannes Pfau via Digitalmars-d
What if I don't want automated memory _management_? What if I want a
function to use a stack buffer? Or if I want to free manually?
If I want std.string.toStringz to put the result into a temporary
stack buffer your solution doesn't help at all. Passing an output
range, allocator or buffer would all solve this.
Correct. The output of toStringz would be either a GC string or an RC
string.
But why not provide 3 overloads then?

toStringz(OutputRange)
string toStringz(Policy) //char*, actually
RCString toStringz(Policy)

The notion I got from some of your posts is that you're opposed to such
overloads, or did I misinterpret that?
Sean Kelly via Digitalmars-d
2014-09-30 17:19:27 UTC
Permalink
On Tuesday, 30 September 2014 at 16:49:48 UTC, Johannes Pfau
Post by Johannes Pfau via Digitalmars-d
I guess my point is that although RC is useful in some cases
output
ranges / sink delegates / pre-allocated buffers are still
necessary in
other cases and RC is not the solution for _everything_.
Yes, I'm hoping this is an adjunct to changes in Phobos to reduce
the frequency of implicit allocation in general. The less
garbage that's generated, the less GC vs. RC actually matters.
Andrei Alexandrescu via Digitalmars-d
2014-10-01 09:21:44 UTC
Permalink
Post by Johannes Pfau via Digitalmars-d
I guess my point is that although RC is useful in some cases output
ranges / sink delegates / pre-allocated buffers are still necessary in
other cases and RC is not the solution for _everything_.
Agreed.
Post by Johannes Pfau via Digitalmars-d
As Manu often pointed out sometimes you do not want any dynamic
allocation (toStringz in games is a good example) and here RC doesn't
help.
Another example is format which can already write to output ranges and
uses sink delegates internally. That's a much better abstraction than
simply returning a reference counted string (allocated with a thread
local allocator). Using sink delegates internally is also more
efficient than creating temporary RCStrings. And sometimes there's no
allocation at all this way (directly writing to a socket/file).
Agreed.
Post by Johannes Pfau via Digitalmars-d
Post by Andrei Alexandrescu via Digitalmars-d
Post by Johannes Pfau via Digitalmars-d
What if I don't want automated memory _management_? What if I want a
function to use a stack buffer? Or if I want to free manually?
If I want std.string.toStringz to put the result into a temporary
stack buffer your solution doesn't help at all. Passing an output
range, allocator or buffer would all solve this.
Correct. The output of toStringz would be either a GC string or an RC
string.
But why not provide 3 overloads then?
toStringz(OutputRange)
string toStringz(Policy) //char*, actually
RCString toStringz(Policy)
The notion I got from some of your posts is that you're opposed to such
overloads, or did I misinterpret that?
I'm not opposed. Here's what I think.

As an approach to using Phobos without a GC, it's been suggested that we
supplement garbage-creating functions with new functions that use output
ranges everywhere, or lazy ranges everywhere.

I think a better approach is to make memory management a policy that
makes convenient use of reference counting possible. So instead of
garbage there'd be reference counted stuff.

Of course, to the extent using lazy computation and/or output ranges is
a good thing to have for various reasons, they remain valid techniques
that are and will continue being used in Phobos.

My point is that acknowledging and systematically using reference
counted types is an essential part of the entire approach.


Andrei
Chris Williams via Digitalmars-d
2014-09-29 18:22:41 UTC
Permalink
On Monday, 29 September 2014 at 10:49:53 UTC, Andrei Alexandrescu
Post by Andrei Alexandrescu via Digitalmars-d
auto p1 = setExtension("hello", ".txt"); // fine, use gc
auto p2 = setExtension!gc("hello", ".txt"); // same
auto p3 = setExtension!rc("hello", ".txt"); // fine, use rc
So by default it's going to continue being business as usual,
but certain functions will allow passing in a (defaulted)
policy for memory management.
Forcing someone (or rather, a team of someones) to call into the
library in a consistent fashion like this seems like a rather
risky venture. I suppose that you could add some special compiler
checks to make sure that people are being consistent, but I'd
probably rather see some way of templating modules so that the
chances for human error are reduced.

--- foo.d ---
module std.foo(GC = gc);

void bar() {
static if (gc) {
...
}
}

--- usercode.d ---
import std.foo!rc;

void fooCaller() {
bar();
}

Though truthfully, I'd rather it be a compiler flag. But I
presume that there's an issue with that, which it is too early
for my brain to think of.
Shammah Chancellor via Digitalmars-d
2014-09-29 18:44:11 UTC
Permalink
Post by Andrei Alexandrescu via Digitalmars-d
Back when I've first introduced RCString I hinted that we have a larger
strategy in mind. Here it is.
The basic tenet of the approach is to reckon and act on the fact that
memory allocation (the subject of allocators) is an entirely distinct
topic from memory management, and more generally resource management.
This clarifies that it would be wrong to approach alternatives to GC in
Phobos by means of allocators. GC is not only an approach to memory
allocation, but also an approach to memory management. Reducing it to
either one is a mistake. In hindsight this looks rather obvious but it
has caused me and many people better than myself a lot of headache.
That said allocators are nice to have and use, and I will definitely
follow up with std.allocator. However, std.allocator is not the key to
a @nogc Phobos.
Nor are ranges. There is an attitude that either output ranges, or
input ranges in conjunction with lazy computation, would solve the
issue of creating garbage.
https://github.com/D-Programming-Language/phobos/pull/2423 is a good
illustration of the latter approach: a range would be lazily created by
chaining stuff together. A range-based approach would take us further
than the allocators, but I see the following issues with it:
(a) the whole approach doesn't stand scrutiny for non-linear outputs,
e.g. outputting some sort of associative array or really any composite
type quickly becomes tenuous either with an output range (eager) or
with exposing an input range (lazy);
(b) makes the style of programming without GC radically different, and
much more cumbersome, than programming with GC; as a consequence,
programmers who consider changing one approach to another, or
implementing an algorithm neutral to it, are looking at a major rewrite;
(c) would make D/@nogc a poor cousin of C++. This is quite out of
character; technically, I have long gotten used to seeing most
elaborate C++ code like poor emulation of simple D idioms. But C++ has
spent years and decades taking to perfection an approach without a
tracing garbage collector. A departure from that would need to be
superior, and that doesn't seem to be the case with range-based
approaches.
===========
Now that we clarified that these existing attempts are not going to
work well, the question remains what does. For Phobos I'm thinking of
enum MemoryManagementPolicy { gc, rc, mrc }
immutable
gc = MemoryManagementPolicy.gc,
rc = MemoryManagementPolicy.rc,
mrc = MemoryManagementPolicy.mrc;
(a) gc is the classic garbage-collected style of management;
(b) rc is a reference-counted style still backed by the GC, i.e. the GC
will still be able to pick up cycles and other kinds of leaks.
(c) mrc is a reference-counted style backed by malloc.
(It should be possible to collapse rc and mrc together and make the
distinction dynamically, at runtime. I'm distinguishing them statically
here for expository purposes.)
The policy is a template parameter to functions in Phobos (and
auto setExtension(MemoryManagementPolicy mmp = gc, R1, R2)(R1 path, R2 ext)
if (...)
{
static if (mmp == gc) alias S = string;
else alias S = RCString;
S result;
...
return result;
}
auto p1 = setExtension("hello", ".txt"); // fine, use gc
auto p2 = setExtension!gc("hello", ".txt"); // same
auto p3 = setExtension!rc("hello", ".txt"); // fine, use rc
So by default it's going to continue being business as usual, but
certain functions will allow passing in a (defaulted) policy for memory
management.
Destroy!
Andrei
I don't like the idea of having to pass in template parameters
everywhere -- even for allocators. Is there some way we could have
"allocator contexts"?

E.G.

with( auto allocator = ReferenceCounted() )
{
auto foo = setExtension("hello", "txt");
}

ReferenceCounted() could replace a thread-local "new" delegate with
something it has, and when it goes out of scope, it would reset it to
whatever it was before. This would create some runtime overhead --
but I'm not sure how much more than already exists.

-Shammah
Andrei Alexandrescu via Digitalmars-d
2014-09-29 22:15:33 UTC
Permalink
Post by Shammah Chancellor via Digitalmars-d
I don't like the idea of having to pass in template parameters
everywhere -- even for allocators.
I agree.
Post by Shammah Chancellor via Digitalmars-d
Is there some way we could have
"allocator contexts"?
E.G.
with( auto allocator = ReferenceCounted() )
Don't confuse memory allocation with memory management. There's no such
thing as a "reference counted allocator".


Andrei
Shammah Chancellor via Digitalmars-d
2014-09-30 00:01:28 UTC
Permalink
Post by Andrei Alexandrescu via Digitalmars-d
Post by Shammah Chancellor via Digitalmars-d
I don't like the idea of having to pass in template parameters
everywhere -- even for allocators.
I agree.
Post by Shammah Chancellor via Digitalmars-d
Is there some way we could have
"allocator contexts"?
E.G.
with( auto allocator = ReferenceCounted() )
Don't confuse memory allocation with memory management. There's no such
thing as a "reference counted allocator".
Andrei
Sure, but combining the two could be very useful -- as we have noticed
with allocators that work off of a garbage collector. With regards
to reference counting, you could implement one that automatically wraps
the type in an RC struct and proxies them. Being able to redefine
aliases during different sections of compilation would be required
though.
Daniel N via Digitalmars-d
2014-09-30 02:22:43 UTC
Permalink
On Monday, 29 September 2014 at 22:15:32 UTC, Andrei Alexandrescu
Post by Andrei Alexandrescu via Digitalmars-d
Post by Shammah Chancellor via Digitalmars-d
I don't like the idea of having to pass in template parameters
everywhere -- even for allocators.
I agree.
There was a solution earlier in this thread which avoids that
problem. When a function is annotated with @nogc there's
sufficient info to choose the correct implementation without any
parameters, it's already known whether we are instantiated from a
@nogc block or not.
Andrei Alexandrescu via Digitalmars-d
2014-10-01 08:58:51 UTC
Permalink
Post by Shammah Chancellor via Digitalmars-d
I don't like the idea of having to pass in template parameters
everywhere -- even for allocators. Is there some way we could have
"allocator contexts"?
E.G.
with( auto allocator = ReferenceCounted() )
{
auto foo = setExtension("hello", "txt");
}
ReferenceCounted() could replace a thread-local "new" delegate with
something it has, and when it goes out of scope, it would reset it to
whatever it was before. This would create some runtime overhead -- but
I'm not sure how much more than already exists.
I'm not sure whether we can do this within D's type system. -- Andrei
Uranuz via Digitalmars-d
2014-09-29 20:07:40 UTC
Permalink
Post by Andrei Alexandrescu via Digitalmars-d
auto p1 = setExtension("hello", ".txt"); // fine, use gc
auto p2 = setExtension!gc("hello", ".txt"); // same
auto p3 = setExtension!rc("hello", ".txt"); // fine, use rc
So by default it's going to continue being business as usual,
but certain functions will allow passing in a (defaulted)
policy for memory management.
Destroy!
I'll try to destroy ;) Before thinking out some answers to this
problem let me ask a few more questions.

1. As far as I understand, allocation and memory management of
entities like classes (Object), dynamic arrays and associative
arrays are part of the language/runtime. What is proposed here is a
*fix* to the standard library. But allocation and MM happening
via the GC is not the *fault* of the standard library; it is the
predefined behaviour of the D language itself and its runtime. The
standard library becomes a `hostage` of the runtime library in this
situation. Are you really sure that we should "fix" the standard
library in that way? To me it looks like implementing struts for the
standard lib (which is not broken yet ;) ) in order to compensate for
the behaviour of the runtime lib.

2. The second question is slightly off-topic, but I still want to put it
here. What I dislike about ranges and the standard library is that
it's hard to understand what the returned value of a library
function is. I have some *pedals* (front, popFront) to push and do
some magic. Of course this was done for the purpose of making universal
algorithms. But the more I use ranges and *auto*, the less I believe
that I am using a statically typed language. What is needed to make code
clear is a distinct variable declaration with a specification
of its type. With all of these autos the logic of the programme becomes
unclear, because the data structures are unclear. So I came to the
question: is the memory management or allocation policy
syntactically part of the declaration, or is it an inner implementation
detail that should not be shown in the declaration?

Should rc and gc strings look similar or not?

string str1 = makeGCString("test");
string str2 = makeRCString("test");

// --- vs ---

GCString str1 = "test";
RCString str2 = "test";

// --- or ---

String!GC str1 = "test";
String!RC str2 = "test";

// --- or even ---
@gc string str1 = "test";
@rc string str2 = "test";

As far as I understand currently we will have:
string str1 = "test";
RCString str2 = "test";

So another question is why the same object "string" is
implemented as different types. Array and struct (class)?

3. Should algorithms based on the range interface care about
allocation? A range is about iteration and access to elements,
not about allocation and memory management.

I would like to have attributes @rc, @gc (or the like) to
switch the MM policy, as opposed to *String!RC* or *RCString*, but we
cannot apply attributes to a literal. Passing something like this
to an algorithm:

find( @rc "test", @rc "t" )

is syntactically incorrect. But we can use this form:

find( RCString("test"), RCString("t") )

But the above form is more verbose. As a continuation of this question,
I have the next question.

4. How to deal with literals? How to make them ref-counted?

I ask this because even when writing RCString("test"),
syntactically the expression "test" is still a GC-managed literal. I
pass a GC-managed literal into a struct to make it RC-managed. Why
not just make it RC from the start?

Adding some additional template parameter to the algorithm will not
fix this. It is a problem of D itself and its runtime library.


So I assume that the std lib is not broken in this way and we should not
try to fix it this way. Thanks for your attention.
Mike via Digitalmars-d
2014-09-30 04:18:11 UTC
Permalink
Post by Uranuz via Digitalmars-d
1. As far as I understand, allocation and memory management of
entities like classes (Object), dynamic arrays and associative
arrays are part of the language/runtime. What is proposed here is a
*fix* to the standard library. But allocation and MM happening
via the GC is not the *fault* of the standard library; it is the
predefined behaviour of the D language itself and its runtime. The
standard library becomes a `hostage` of the runtime library in this
situation. Are you really sure that we should "fix" the standard
library in that way? To me it looks like implementing struts for the
standard lib (which is not broken yet ;) ) in order to compensate for
the behaviour of the runtime lib.
This really hits the nail on the head, and I think your other
comments and questions are also quite insightful.

IMO the proposal that started this thread, @nogc, and -vgc are
all beating around the bush rather than addressing the
fundamental problem.

Mike
Andrei Alexandrescu via Digitalmars-d
2014-10-01 09:07:31 UTC
Permalink
Post by Uranuz via Digitalmars-d
1. As far as I understand, allocation and memory management of
entities like classes (Object), dynamic arrays and associative
arrays are part of the language/runtime. What is proposed here is a
*fix* to the standard library. But allocation and MM happening
via the GC is not the *fault* of the standard library; it is the
predefined behaviour of the D language itself and its runtime. The
standard library becomes a `hostage` of the runtime library in this
situation. Are you really sure that we should "fix" the standard
library in that way? To me it looks like implementing struts for the
standard lib (which is not broken yet ;) ) in order to compensate for
the behaviour of the runtime lib.
The change will be to both the runtime and the standard library.
Post by Uranuz via Digitalmars-d
2. The second question is slightly off-topic, but I still want to put it
here. What I dislike about ranges and the standard library is that
it's hard to understand what the returned value of a library
function is. I have some *pedals* (front, popFront) to push and do
some magic. Of course this was done for the purpose of making universal
algorithms. But the more I use ranges and *auto*, the less I believe
that I am using a statically typed language. What is needed to make code
clear is a distinct variable declaration with a specification
of its type. With all of these autos the logic of the programme becomes
unclear, because the data structures are unclear. So I came to the
question: is the memory management or allocation policy
syntactically part of the declaration, or is it an inner implementation
detail that should not be shown in the declaration?
Sadly this is the way things are going (not only in D, but other
languages such as C++, Haskell, Scala, etc). Type proliferation has
costs, but also a ton of benefits.

Most often the memory management policy will be part of function
signatures because it affects data type definitions.
Post by Uranuz via Digitalmars-d
Should rc and gc strings look similar or not?
string str1 = makeGCString("test");
string str2 = makeRCString("test");
// --- vs ---
GCString str1 = "test";
RCString str2 = "test";
// --- or ---
String!GC str1 = "test";
String!RC str2 = "test";
// --- or even ---
@gc string str1 = "test";
@rc string str2 = "test";
string str1 = "test";
RCString str2 = "test";
Per Sean's idea things would go GC.string vs. RC.string, where GC and RC
are two memory management policies (simple structs defining aliases and
probably a few primitives).
Post by Uranuz via Digitalmars-d
So another question is why the same object "string" is
implemented as different types. Array and struct (class)?
A reference counted string has a different layout than immutable(char)[].
Post by Uranuz via Digitalmars-d
3. Should algorithms based on the range interface care about
allocation? A range is about iteration and access to elements,
not about allocation and memory management.
Most don't.
Post by Uranuz via Digitalmars-d
switch MM-policy versus *String!RC* or *RCString* but we cannot
apply attributes to a literal. Passing to an algorithm something like
find( RCString("test"), RCString("t") )
But the above form is more verbose. As a continuation of this question
I have the next question.
If language changes are necessary, we will make language changes. I'm
trying first to explore solutions within the language.
Post by Uranuz via Digitalmars-d
4. How to deal with literals? How to make them ref-counted?
I don't know yet.
Post by Uranuz via Digitalmars-d
I ask this because even when writing RCString("test"),
syntactically the expression "test" is still a GC-managed literal. I
pass a GC-managed literal into a struct to make it RC-managed. Why
not just make it RC from the start?
Adding some additional template parameter to an algorithm will not
fix this. It is a problem of D itself and its runtime library.
I understand. The problem is actually worse with array literals, which
are silently dynamically allocated on the garbage-collected heap:

auto s = "hello"; // at least there's no allocation
auto a = [1, 2, 3]; // dynamic allocation

A language-based solution would change array literal syntax. A
library-based solution would leave array literals with today's syntax
and semantics and offer a controlled alternative a la:

auto a = MyMemPolicy.array(1, 2, 3); // cool
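
To make the library-based option a bit more concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the `MallocPolicy` name, its nested `Array` type and the `array` primitive are hypothetical, not an existing or planned Phobos API.

```d
import core.stdc.stdlib : free, malloc;

// Hypothetical malloc-backed policy; names are illustrative only.
struct MallocPolicy
{
    // Owning, non-copyable array that releases its memory on scope exit.
    static struct Array(T)
    {
        private T* ptr;
        private size_t len;

        @disable this(this);                  // no implicit copies
        ~this() @nogc nothrow { free(ptr); }  // free(null) is a no-op

        // View the elements as a slice without transferring ownership.
        inout(T)[] opSlice() inout @nogc nothrow { return ptr[0 .. len]; }
    }

    // Typesafe variadic: the arguments are gathered without touching the GC.
    static Array!T array(T)(T[] items...) @nogc nothrow
    {
        Array!T result;
        result.ptr = cast(T*) malloc(T.sizeof * items.length);
        assert(result.ptr !is null || items.length == 0, "malloc failed");
        result.len = items.length;
        foreach (i, item; items)
            result.ptr[i] = item;
        return result;
    }
}

void main() @nogc
{
    auto a = MallocPolicy.array(1, 2, 3); // no GC allocation
    assert(a[].length == 3);
    assert(a[][0] == 1 && a[][2] == 3);
}
```

The point of the sketch is only that `main` can be marked @nogc while the call site stays close to the literal syntax; a GC-backed policy could offer the same `array` primitive and return a plain `T[]`.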
Post by Uranuz via Digitalmars-d
So I assume that std lib is not broken this way and we should not
try to fix it this way. Thanks for attention.
And thanks for your great points.


Andrei
Freddy via Digitalmars-d
2014-09-29 22:11:35 UTC
Permalink
On Monday, 29 September 2014 at 10:49:53 UTC, Andrei Alexandrescu
Post by Andrei Alexandrescu via Digitalmars-d
Back when I've first introduced RCString I hinted that we have
a larger strategy in mind. Here it is.
The basic tenet of the approach is to reckon and act on the
fact that memory allocation (the subject of allocators) is an
entirely distinct topic from memory management, and more
generally resource management. This clarifies that it would be
wrong to approach alternatives to GC in Phobos by means of
allocators. GC is not only an approach to memory allocation,
but also an approach to memory management. Reducing it to
either one is a mistake. In hindsight this looks rather obvious
but it has caused me and many people better than myself a lot
of headache.
That said allocators are nice to have and use, and I will
definitely follow up with std.allocator. However, std.allocator
is not the key to a @nogc Phobos.
Nor are ranges. There is an attitude that either output ranges,
or input ranges in conjunction with lazy computation, would
solve the issue of creating garbage.
https://github.com/D-Programming-Language/phobos/pull/2423 is a
good illustration of the latter approach: a range would be
lazily created by chaining stuff together. A range-based
approach would take us further than the allocators, but I see
the following issues with it:
(a) the whole approach doesn't stand scrutiny for non-linear
outputs, e.g. outputting some sort of associative array or
really any composite type quickly becomes tenuous either with
an output range (eager) or with exposing an input range (lazy);
(b) makes the style of programming without GC radically
different, and much more cumbersome, than programming with GC;
as a consequence, programmers who consider changing one
approach to another, or implementing an algorithm neutral to
it, are looking at a major rewrite;
(c) would make D/@nogc a poor cousin of C++. This is quite out
of character; technically, I have long gotten used to seeing
most elaborate C++ code like poor emulation of simple D idioms.
But C++ has spent years and decades taking to perfection an
approach without a tracing garbage collector. A departure from
that would need to be superior, and that doesn't seem to be the
case with range-based approaches.
===========
Now that we clarified that these existing attempts are not
going to work well, the question remains what does. For Phobos
enum MemoryManagementPolicy { gc, rc, mrc }
immutable
gc = MemoryManagementPolicy.gc,
rc = MemoryManagementPolicy.rc,
mrc = MemoryManagementPolicy.mrc;
(a) gc is the classic garbage-collected style of management;
(b) rc is a reference-counted style still backed by the GC,
i.e. the GC will still be able to pick up cycles and other
kinds of leaks.
(c) mrc is a reference-counted style backed by malloc.
(It should be possible to collapse rc and mrc together and make
the distinction dynamically, at runtime. I'm distinguishing
them statically here for expository purposes.)
The policy is a template parameter to functions in Phobos (and
elsewhere), and informs the functions e.g. what types to
auto setExtension(MemoryManagementPolicy mmp = gc, R1, R2)(R1
path, R2 ext)
if (...)
{
static if (mmp == gc) alias S = string;
else alias S = RCString;
S result;
...
return result;
}
auto p1 = setExtension("hello", ".txt"); // fine, use gc
auto p2 = setExtension!gc("hello", ".txt"); // same
auto p3 = setExtension!rc("hello", ".txt"); // fine, use rc
So by default it's going to continue being business as usual,
but certain functions will allow passing in a (defaulted)
policy for memory management.
Destroy!
Andrei
Internally we should have something like:

---
template String(MemoryManagementPolicy mmp=gc){
/++ ... +/
}
auto setExtension(MemoryManagementPolicy mmp = gc, R1, R2)(R1
path, R2 ext)
if (...)
{
auto result=String!mmp();
/++ +/
}
----

or maybe even allowing user types in the template argument(the
original purpose of templates)

---
auto setExtension(String = string, R1, R2)(R1
path, R2){
/++ +/
}
----
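
A rough sketch of how the second variant (a user-supplied result type) might look in practice. This is only an illustration under stated assumptions: `FixedString` is a made-up user type, and the body of `setExtension` is a simplification that replaces everything from the last '.' and ignores directory separators.

```d
// Hypothetical user type: a fixed-capacity string that never allocates.
struct FixedString
{
    char[64] buf;
    size_t len;

    // Support `~=` so generic code can append to it like a built-in string.
    void opOpAssign(string op)(const(char)[] s) if (op == "~")
    {
        buf[len .. len + s.length] = s[]; // bounds-checked copy
        len += s.length;
    }

    const(char)[] opSlice() const { return buf[0 .. len]; }
}

// The result type is a defaulted template argument; the function only
// requires that `String` is default-constructible and supports `~=`.
String setExtension(String = string, R1, R2)(R1 path, R2 ext)
{
    size_t end = path.length;
    foreach_reverse (i, c; path)
        if (c == '.') { end = i; break; }

    String result;
    result ~= path[0 .. end];
    result ~= ext;
    return result;
}

void main()
{
    auto p1 = setExtension("hello.d", ".txt");              // default: string
    static assert(is(typeof(p1) == string));
    assert(p1 == "hello.txt");

    auto p2 = setExtension!FixedString("hello.d", ".txt");  // user type
    assert(p2[] == "hello.txt");
}
```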
Andrei Alexandrescu via Digitalmars-d
2014-09-29 22:16:13 UTC
Permalink
Post by Freddy via Digitalmars-d
---
template String(MemoryManagementPolicy mmp=gc){
/++ ... +/
}
auto setExtension(MemoryManagementPolicy mmp = gc, R1, R2)(R1
path, R2 ext)
if (...)
{
auto result=String!mmp();
/++ +/
}
----
or maybe even allowing user types in the template argument(the
original purpose of templates)
---
auto setExtension(String = string, R1, R2)(R1
path, R2){
/++ +/
}
That's correct. -- Andrei
Andrei Alexandrescu via Digitalmars-d
2014-10-01 09:08:31 UTC
Permalink
Post by Freddy via Digitalmars-d
---
template String(MemoryManagementPolicy mmp=gc){
/++ ... +/
}
auto setExtension(MemoryManagementPolicy mmp = gc, R1, R2)(R1
path, R2 ext)
if (...)
{
auto result=String!mmp();
/++ +/
}
----
or maybe even allowing user types in the template argument(the
original purpose of templates)
---
auto setExtension(String = string, R1, R2)(R1
path, R2){
/++ +/
}
----
Good idea, and it seems Sean's is even better because it groups
everything related to memory management where it belongs - in the memory
management policy. -- Andrei
Foo via Digitalmars-d
2014-09-30 13:38:42 UTC
Permalink
I hate the fact that this will produce template bloat for each
function/method.
I'm also in favor of "let the user pick", but I would use a
global variable:

----
enum MemoryManagementPolicy { gc, rc, mrc }
immutable
gc = MemoryManagementPolicy.gc,
rc = MemoryManagementPolicy.rc,
mrc = MemoryManagementPolicy.mrc;

auto RMP = gc;
----

and in my code:

----
RMP = rc;
string str = "foo"; // compiler knows -> ref counted
// ...
RMP = gc;
string str2 = "bar"; // normal behaviour restored
----
Foo via Digitalmars-d
2014-09-30 13:39:51 UTC
Permalink
Post by Foo via Digitalmars-d
I hate the fact that this will produce template bloat for each
function/method.
I'm also in favor of "let the user pick", but I would use a
----
enum MemoryManagementPolicy { gc, rc, mrc }
immutable
gc = MemoryManagementPolicy.gc,
rc = MemoryManagementPolicy.rc,
mrc = MemoryManagementPolicy.mrc;
auto RMP = gc;
----
----
RMP = rc;
string str = "foo"; // compiler knows -> ref counted
// ...
RMP = gc;
string str2 = "bar"; // normal behaviour restored
----
Of course each method/function in Phobos should use the global
RMP.
Andrei Alexandrescu via Digitalmars-d
2014-09-30 13:59:25 UTC
Permalink
Post by Foo via Digitalmars-d
I hate the fact that this will produce template bloat for each
function/method.
I'm also in favor of "let the user pick", but I would use a global
----
enum MemoryManagementPolicy { gc, rc, mrc }
immutable
gc = MemoryManagementPolicy.gc,
rc = MemoryManagementPolicy.rc,
mrc = MemoryManagementPolicy.mrc;
auto RMP = gc;
----
----
RMP = rc;
string str = "foo"; // compiler knows -> ref counted
// ...
RMP = gc;
string str2 = "bar"; // normal behaviour restored
----
This won't work because the type of "string" is different for RC vs. GC.
-- Andrei
Foo via Digitalmars-d
2014-09-30 14:05:42 UTC
Permalink
On Tuesday, 30 September 2014 at 13:59:23 UTC, Andrei
Post by Andrei Alexandrescu via Digitalmars-d
Post by Foo via Digitalmars-d
I hate the fact that this will produce template bloat for each
function/method.
I'm also in favor of "let the user pick", but I would use a
global
----
enum MemoryManagementPolicy { gc, rc, mrc }
immutable
gc = MemoryManagementPolicy.gc,
rc = MemoryManagementPolicy.rc,
mrc = MemoryManagementPolicy.mrc;
auto RMP = gc;
----
----
RMP = rc;
string str = "foo"; // compiler knows -> ref counted
// ...
RMP = gc;
string str2 = "bar"; // normal behaviour restored
----
This won't work because the type of "string" is different for
RC vs. GC. -- Andrei
But it would work for phobos functions without template bloat.
via Digitalmars-d
2014-09-30 14:13:36 UTC
Permalink
Post by Foo via Digitalmars-d
On Tuesday, 30 September 2014 at 13:59:23 UTC, Andrei
Post by Andrei Alexandrescu via Digitalmars-d
This won't work because the type of "string" is different for
RC vs. GC. -- Andrei
But it would work for phobos functions without template bloat.
Only for internal allocations. If the functions want to return
something, the type must be known.
Andrei Alexandrescu via Digitalmars-d
2014-09-30 15:33:02 UTC
Permalink
Post by via Digitalmars-d
Post by Foo via Digitalmars-d
On Tuesday, 30 September 2014 at 13:59:23 UTC, Andrei
Post by Andrei Alexandrescu via Digitalmars-d
This won't work because the type of "string" is different for RC vs.
GC. -- Andrei
But it would work for phobos functions without template bloat.
Only for internal allocations. If the functions want to return
something, the type must be known.
Ah, now I understand the point. Thanks. -- Andrei
Andrei Alexandrescu via Digitalmars-d
2014-09-30 15:32:05 UTC
Permalink
Post by Foo via Digitalmars-d
On Tuesday, 30 September 2014 at 13:59:23 UTC, Andrei
Post by Andrei Alexandrescu via Digitalmars-d
Post by Foo via Digitalmars-d
I hate the fact that this will produce template bloat for each
function/method.
I'm also in favor of "let the user pick", but I would use a global
----
enum MemoryManagementPolicy { gc, rc, mrc }
immutable
gc = MemoryManagementPolicy.gc,
rc = MemoryManagementPolicy.rc,
mrc = MemoryManagementPolicy.mrc;
auto RMP = gc;
----
----
RMP = rc;
string str = "foo"; // compiler knows -> ref counted
// ...
RMP = gc;
string str2 = "bar"; // normal behaviour restored
----
This won't work because the type of "string" is different for RC vs.
GC. -- Andrei
But it would work for phobos functions without template bloat.
How is the fact there's less bloat relevant for code that doesn't work?
I.e. it doesn't compile. It needs to return string for GC and RCString
for RC.
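
To spell out the typing issue, here is a small sketch. The names `RCStringStub`, `make` and the global `RMP` are placeholders assumed for the example; `RCStringStub` merely stands in for a real RCString.

```d
enum MemoryManagementPolicy { gc, rc }

// Hypothetical stand-in for RCString; only its distinct type matters here.
struct RCStringStub { string payload; }

// Compile-time policy: each instantiation has exactly one return type.
auto make(MemoryManagementPolicy mmp)(string s)
{
    static if (mmp == MemoryManagementPolicy.gc) return s;
    else return RCStringStub(s);
}

// Run-time policy, as in the global-variable suggestion.
MemoryManagementPolicy RMP;

// A single function body cannot return string on one branch and
// RCStringStub on the other, so this literal is rejected by the compiler.
enum runtimeSwitchCompiles = __traits(compiles, (string s) {
    if (RMP == MemoryManagementPolicy.gc) return s;
    else return RCStringStub(s);
});
static assert(!runtimeSwitchCompiles);

void main()
{
    auto a = make!(MemoryManagementPolicy.gc)("foo");
    auto b = make!(MemoryManagementPolicy.rc)("foo");
    static assert(is(typeof(a) == string));
    static assert(is(typeof(b) == RCStringStub));
    assert(a == b.payload);
}
```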

Andrei
John Colvin via Digitalmars-d
2014-09-30 14:07:31 UTC
Permalink
On Monday, 29 September 2014 at 10:49:53 UTC, Andrei Alexandrescu
Post by Andrei Alexandrescu via Digitalmars-d
Back when I've first introduced RCString I hinted that we have
a larger strategy in mind. Here it is.
The basic tenet of the approach is to reckon and act on the
fact that memory allocation (the subject of allocators) is an
entirely distinct topic from memory management, and more
generally resource management. This clarifies that it would be
wrong to approach alternatives to GC in Phobos by means of
allocators. GC is not only an approach to memory allocation,
but also an approach to memory management. Reducing it to
either one is a mistake. In hindsight this looks rather obvious
but it has caused me and many people better than myself a lot
of headache.
That said allocators are nice to have and use, and I will
definitely follow up with std.allocator. However, std.allocator
is not the key to a @nogc Phobos.
Nor are ranges. There is an attitude that either output ranges,
or input ranges in conjunction with lazy computation, would
solve the issue of creating garbage.
https://github.com/D-Programming-Language/phobos/pull/2423 is a
good illustration of the latter approach: a range would be
lazily created by chaining stuff together. A range-based
approach would take us further than the allocators, but I see
the following issues with it:
(a) the whole approach doesn't stand scrutiny for non-linear
outputs, e.g. outputting some sort of associative array or
really any composite type quickly becomes tenuous either with
an output range (eager) or with exposing an input range (lazy);
(b) makes the style of programming without GC radically
different, and much more cumbersome, than programming with GC;
as a consequence, programmers who consider changing one
approach to another, or implementing an algorithm neutral to
it, are looking at a major rewrite;
(c) would make D/@nogc a poor cousin of C++. This is quite out
of character; technically, I have long gotten used to seeing
most elaborate C++ code like poor emulation of simple D idioms.
But C++ has spent years and decades taking to perfection an
approach without a tracing garbage collector. A departure from
that would need to be superior, and that doesn't seem to be the
case with range-based approaches.
===========
Now that we clarified that these existing attempts are not
going to work well, the question remains what does. For Phobos
enum MemoryManagementPolicy { gc, rc, mrc }
immutable
gc = MemoryManagementPolicy.gc,
rc = MemoryManagementPolicy.rc,
mrc = MemoryManagementPolicy.mrc;
(a) gc is the classic garbage-collected style of management;
(b) rc is a reference-counted style still backed by the GC,
i.e. the GC will still be able to pick up cycles and other
kinds of leaks.
(c) mrc is a reference-counted style backed by malloc.
(It should be possible to collapse rc and mrc together and make
the distinction dynamically, at runtime. I'm distinguishing
them statically here for expository purposes.)
The policy is a template parameter to functions in Phobos (and
elsewhere), and informs the functions e.g. what types to
auto setExtension(MemoryManagementPolicy mmp = gc, R1, R2)(R1
path, R2 ext)
if (...)
{
static if (mmp == gc) alias S = string;
else alias S = RCString;
S result;
...
return result;
}
auto p1 = setExtension("hello", ".txt"); // fine, use gc
auto p2 = setExtension!gc("hello", ".txt"); // same
auto p3 = setExtension!rc("hello", ".txt"); // fine, use rc
So by default it's going to continue being business as usual,
but certain functions will allow passing in a (defaulted)
policy for memory management.
Destroy!
Andrei
Instead of adding a new template parameter to every function
(which won't necessarily play nicely with existing IFTI and
variadic templates), why not allow template modules?

import stringRC = std.string!rc;
import stringGC = std.string!gc;


// in std/string.d
module std.string(MemoryManagementPolicy mmp)

pure @trusted S capitalize(S)(S s)
if (isSomeString!S)
{
//...

static if(mmp == MemoryManagementPolicy.gc)
{
//...
}
else static if .......
}
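
For comparison, something in this spirit can be approximated in today's language by wrapping a group of functions in one template and aliasing its instances. The sketch below is only that, a sketch: `StringFuncs`, `RCStringStub` and the ASCII-only `capitalize` are assumptions made for the example, and the stub still uses the GC internally.

```d
enum MemoryManagementPolicy { gc, rc }

// Hypothetical stand-in for RCString.
struct RCStringStub { string payload; }

// One policy parameter for a whole group of functions, so the individual
// functions keep their usual signatures.
template StringFuncs(MemoryManagementPolicy mmp)
{
    static if (mmp == MemoryManagementPolicy.gc) alias S = string;
    else alias S = RCStringStub;

    // Simplified, ASCII-only capitalize.
    S capitalize(const(char)[] s)
    {
        char[] buf = s.dup; // scratch copy (GC-backed in this sketch)
        if (buf.length && buf[0] >= 'a' && buf[0] <= 'z')
            buf[0] = cast(char)(buf[0] - ('a' - 'A'));
        string done = buf.idup;

        static if (mmp == MemoryManagementPolicy.gc) return done;
        else return RCStringStub(done);
    }
}

alias stringGC = StringFuncs!(MemoryManagementPolicy.gc);
alias stringRC = StringFuncs!(MemoryManagementPolicy.rc);

void main()
{
    assert(stringGC.capitalize("hello") == "Hello");
    assert(stringRC.capitalize("hello").payload == "Hello");
    static assert(is(stringRC.S == RCStringStub));
}
```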
Andrei Alexandrescu via Digitalmars-d
2014-10-01 09:14:33 UTC
Permalink
Instead of adding a new template parameter to every function (which
won't necessarily play nicely with existing IFTI and variadic
templates), why not allow template modules?
Nice idea, but let's try and explore possibilities within the existing
rich language. If a need for new language features arises, I trust we'll
see it. -- Andrei
Sean Kelly via Digitalmars-d
2014-09-30 16:10:43 UTC
Permalink
On Monday, 29 September 2014 at 10:49:53 UTC, Andrei Alexandrescu
Post by Andrei Alexandrescu via Digitalmars-d
The policy is a template parameter to functions in Phobos (and
elsewhere), and informs the functions e.g. what types to
auto setExtension(MemoryManagementPolicy mmp = gc, R1, R2)(R1
path, R2 ext)
if (...)
{
static if (mmp == gc) alias S = string;
else alias S = RCString;
S result;
...
return result;
}
Is this for exposition purposes or actually how you expect it to
work? Quite honestly, I can't imagine how I could write a
template function in D that needs to work with this approach.

As much as I hate to say it, this is pretty much exactly what C++
allocators were designed for. They handle allocation, sure, but
they also hold aliases for all relevant types for the data being
allocated. If the MemoryManagementPolicy enum were replaced with
an alias to a type that I could use to at least obtain relevant
aliases, that would be something. But even that approach
dramatically complicates code that uses it.

Having written standards-compliant containers in C++, I honestly
can't imagine the average user writing code that works this way.
Once you assert that the reference type may be a pointer or it
may be some complex proxy to data stored elsewhere, a lot of
composability pretty much flies right out the window.

For example, I have an implementation of C++
unordered_map/set/etc designed to be a customizable cache, so one
of its template arguments is a policy type that allows eviction
behavior to be chosen at declaration time. Maybe the cache is
size-limited, maybe it's age-limited, maybe it's a combination of
the two or something even more complicated. The problem is that
the container defines all the aliases relating to the underlying
data, but the policy, which needs to be aware of these, is passed
as a template argument to this container.

To make something that's fully aware of C++ allocators then, I'd
have to define a small type that takes the container template
arguments (the contained type and the allocator type) and
generates the aliases and pass this to the policy, which in turn
passes the type through to the underlying container so it can
declare its public aliases and whatever else in true
standards-compliant fashion (or let the container derive this
itself, but then you run into the potential for disagreement).
And while this is possible, doing so would complicate the
creation of the cache policies to the point where it subverts
their intent, which was to make it easy for the user to tune the
behavior of the cache to their own particular needs by defining a
simple type which implements a few functions. Ultimately, I
decided against this approach for the cache container and decided
to restrict the allocators to those which defined a pointer to T
as T* so the policies could be coded with basically no knowledge
of the underlying storage.

So... while I support the goal you're aiming at, I want to see a
much more comprehensive example of how this will work and how it
will affect code written by D *users*. Because it isn't enough
for Phobos to be written this way. Basically all D code will
have to take this into account for the strategy to be truly
viable. Simply outlining one of the most basic functions in
Phobos, which already looks like it will have a static
conditional at the beginning and *need to be aware of the fact
that an RCString type exists* makes me terrified of what a
realistic example will look like.
H. S. Teoh via Digitalmars-d
2014-09-30 17:33:02 UTC
Permalink
Post by Andrei Alexandrescu via Digitalmars-d
The policy is a template parameter to functions in Phobos (and
auto setExtension(MemoryManagementPolicy mmp = gc, R1, R2)(R1 path, R2 ext)
if (...)
{
static if (mmp == gc) alias S = string;
else alias S = RCString;
S result;
...
return result;
}
Is this for exposition purposes or actually how you expect it to work?
Quite honestly, I can't imagine how I could write a template function
in D that needs to work with this approach.
As much as I hate to say it, this is pretty much exactly what C++
allocators were designed for. They handle allocation, sure, but they
also hold aliases for all relevant types for the data being allocated.
[...]
So... while I support the goal you're aiming at, I want to see a much
more comprehensive example of how this will work and how it will
affect code written by D *users*. Because it isn't enough for Phobos
to be written this way. Basically all D code will have to take this
into account for the strategy to be truly viable. Simply outlining
one of the most basic functions in Phobos, which already looks like it
will have a static conditional at the beginning and *need to be aware
of the fact that an RCString type exists* makes me terrified of what a
realistic example will look like.
Yeah, this echoes my concern. This looks not that much different, from a
user's POV, from C++ containers' allocator template parameters. Yes I
know we're not talking about *allocators* per se but about *memory
management*, but I'm talking about the need to explicitly pass mmp to
*every* *single* *function* if you desire anything but the default. How
many people actually *use* the allocator parameter in STL? Certainly,
many people do... but the code is anything but readable / maintainable.

Not only that, but every single function will have to handle this
parameter somehow, and if static if's at the top of the function is what
we're starting with, I fear seeing what we end up with.

Furthermore, in order for this to actually work, it has to be percolated
throughout the entire codebase -- any D library that even remotely uses
Phobos for anything will have to percolate this parameter throughout its
API -- at least, any part of the API that might potentially use a Phobos
function. Otherwise, you still have the situation where a given D
library doesn't allow the user to select a memory management scheme, and
internally calls Phobos functions with the default settings. So this
still doesn't solve the problem that today, people who need to use @nogc
can't use a lot of existing libraries because the library depends on the
GC, even if it doesn't assume anything about the MM scheme, but just
happens to call some obscure Phobos function with the default MM
parameter. The only way this could work was if *every* D library author
voluntarily rewrites a lot of code in order to percolate this MM
parameter through to the API, on the off-chance that some obscure user
somewhere might have need to use it. I don't see much likelihood of this
actually happening.

Then there's the matter of functions like parseJSON() that need to
allocate nodes and return a tree (or whatever) of these nodes. Note that
they need to *allocate*, not just know what kind of memory management
model is to be used. So how do you propose to address this? Via another
parameter (compile-time or otherwise) to specify which allocator to use?
So how does the memory management parameter solve anything then? And how
would such a thing be implemented? Using a 3-way static-if branch in
every single point in parseJSON where it needs to allocate nodes? We
could just as well write it in C++, if that's the case.

This proposal has many glaring holes that need to be fixed before it can
be viable.


T
--
EMACS = Extremely Massive And Cumbersome System
Andrei Alexandrescu via Digitalmars-d
2014-10-01 08:55:58 UTC
Permalink
Post by Andrei Alexandrescu via Digitalmars-d
The policy is a template parameter to functions in Phobos (and
auto setExtension(MemoryManagementPolicy mmp = gc, R1, R2)(R1 path, R2 ext)
if (...)
{
static if (mmp == gc) alias S = string;
else alias S = RCString;
S result;
...
return result;
}
Is this for exposition purposes or actually how you expect it to work?
That's pretty much what it would take. The key here is that RCString is
almost a drop-in replacement for string, so the code using it is almost
identical. There will be places where code needs to be replaced, e.g.

auto s = "literal";

would need to become

S s = "literal";

So creation of strings will change a bit, but overall there's not a lot
of churn.
Quite honestly, I can't imagine how I could write a template function
in D that needs to work with this approach.
You mean write a function that accepts a memory management policy, or a
function that uses one?
As much as I hate to say it, this is pretty much exactly what C++
allocators were designed for. They handle allocation, sure, but they
also hold aliases for all relevant types for the data being allocated.
If the MemoryManagementPolicy enum were replaced with an alias to a type
that I could use to at least obtain relevant aliases, that would be
something. But even that approach dramatically complicates code that
uses it.
I think making MemoryManagementPolicy a meaningful type is a great idea.
It would e.g. define the string type, so the code becomes:

auto setExtension(alias MemoryManagementPolicy = gc, R1, R2)(R1 path, R2
ext)
if (...)
{
MemoryManagementPolicy.string result;
...
return result;
}

This is a lot more general and extensible. Thanks!
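
A minimal sketch of what such policy types could look like, under stated assumptions: `GC`, `RC` and `RCStringStub` are placeholder names, the stub only supports what the example needs, and the extension logic is simplified to replacing everything from the last '.'.

```d
// Hypothetical stand-in for RCString: appendable, nothing more.
struct RCStringStub
{
    private string payload; // a real RCString would own a counted buffer

    void opOpAssign(string op)(const(char)[] s) if (op == "~") { payload ~= s; }
    string toString() const { return payload; }
}

// Policies are plain structs that name the types to use.
struct GC { alias string = immutable(char)[]; }
struct RC { alias string = RCStringStub; }

auto setExtension(alias Policy = GC, R1, R2)(R1 path, R2 ext)
{
    Policy.string result; // the only policy-dependent line

    size_t end = path.length;
    foreach_reverse (i, c; path)
        if (c == '.') { end = i; break; }

    result ~= path[0 .. end];
    result ~= ext;
    return result;
}

void main()
{
    auto p1 = setExtension("hello.d", ".txt");    // default policy
    static assert(is(typeof(p1) == immutable(char)[]));
    assert(p1 == "hello.txt");

    auto p3 = setExtension!RC("hello.d", ".txt"); // reference-counted policy
    static assert(is(typeof(p3) == RCStringStub));
    assert(p3.toString() == "hello.txt");
}
```

Note that the function body no longer mentions RCString at all; a user-defined policy would only need to supply its own `string` alias.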

Why do you think there'd be dramatic complication of code? (Granted, at
some point we must acknowledge that some egg breaking is necessary for
the proverbial omelette.)
Having written standards-compliant containers in C++, I honestly can't
imagine the average user writing code that works this way. Once you
assert that the reference type may be a pointer or it may be some
complex proxy to data stored elsewhere, a lot of composability pretty
much flies right out the window.
The thing is, again, we must make some changes if we want D to be usable
without a GC. One of them is e.g. to not allocate built-in slices all
over the place.
For example, I have an implementation of C++ unordered_map/set/etc
designed to be a customizable cache, so one of its template arguments is
a policy type that allows eviction behavior to be chosen at declaration
time. Maybe the cache is size-limited, maybe it's age-limited, maybe
it's a combination of the two or something even more complicated. The
problem is that the container defines all the aliases relating to the
underlying data, but the policy, which needs to be aware of these, is
passed as a template argument to this container.
To make something that's fully aware of C++ allocators then, I'd have to
define a small type that takes the container template arguments (the
contained type and the allocator type) and generates the aliases and
pass this to the policy, which in turn passes the type through to the
underlying container so it can declare its public aliases and whatever
else in true standards-compliant fashion (or let the container derive
this itself, but then you run into the potential for disagreement). And
while this is possible, doing so would complicate the creation of the
cache policies to the point where it subverts their intent, which was to
make it easy for the user to tune the behavior of the cache to their own
particular needs by defining a simple type which implements a few
functions. Ultimately, I decided against this approach for the cache
container and decided to restrict the allocators to those which defined
a pointer to T as T* so the policies could be coded with basically no
knowledge of the underlying storage.
That sounds like a rather involved artifact. Hopefully we can leverage
D's better expressiveness to make building such complex libraries easier.
So... while I support the goal you're aiming at, I want to see a much
more comprehensive example of how this will work and how it will affect
code written by D *users*.
Agreed.
Because it isn't enough for Phobos to be
written this way. Basically all D code will have to take this into
account for the strategy to be truly viable. Simply outlining one of
the most basic functions in Phobos, which already looks like it will
have a static conditional at the beginning and *need to be aware of the
fact that an RCString type exists* makes me terrified of what a
realistic example will look like.
That would be overreacting :o).


Andrei
Sean Kelly via Digitalmars-d
2014-10-01 13:52:33 UTC
Permalink
On Wednesday, 1 October 2014 at 08:55:55 UTC, Andrei Alexandrescu
Post by Andrei Alexandrescu via Digitalmars-d
Post by Sean Kelly via Digitalmars-d
Is this for exposition purposes or actually how you expect it
to work?
That's pretty much what it would take. The key here is that
RCString is almost a drop-in replacement for string, so the
code using it is almost identical. There will be places where
code needs to be replaced, e.g.
auto s = "literal";
would need to become
S s = "literal";
So creation of strings will change a bit, but overall there's
not a lot of churn.
I'm confused. Is this a general-purpose solution or just one
that switches between string and RCString?
Andrei Alexandrescu via Digitalmars-d
2014-10-01 16:59:09 UTC
Permalink
Post by Andrei Alexandrescu via Digitalmars-d
Is this for exposition purposes or actually how you expect it to work?
That's pretty much what it would take. The key here is that RCString
is almost a drop-in replacement for string, so the code using it is
almost identical. There will be places where code needs to be
replaced, e.g.
auto s = "literal";
would need to become
S s = "literal";
So creation of strings will change a bit, but overall there's not a
lot of churn.
I'm confused. Is this a general-purpose solution or just one that
switches between string and RCString?
General purpose since your suggested change. -- Andrei
Sean Kelly via Digitalmars-d
2014-10-01 14:03:43 UTC
Permalink
On Wednesday, 1 October 2014 at 08:55:55 UTC, Andrei Alexandrescu
Post by Andrei Alexandrescu via Digitalmars-d
Quite honestly, I can't imagine how I could write a template
function in D that needs to work with this approach.
You mean write a function that accepts a memory management
policy, or a function that uses one?
Both, I suppose? A static if block at the top of each function
that must be aware of every RC type the user may expect? What if
it's a user-defined RC type and this function is in Phobos?
Post by Andrei Alexandrescu via Digitalmars-d
As much as I hate to say it, this is pretty much exactly what
C++
allocators were designed for. They handle allocation, sure,
but they
also hold aliases for all relevant types for the data being
allocated.
If the MemoryManagementPolicy enum were replaced with an alias
to a type that I could use to at least obtain relevant
aliases, that would be something. But even that approach
dramatically complicates code that uses it.
I think making MemoryManagementPolicy a meaningful type is a
great idea. It would e.g. define the string type, so the code becomes:
auto setExtension(alias MemoryManagementPolicy = gc, R1, R2)(R1
path, R2 ext)
if (...)
{
MemoryManagementPolicy.string result;
...
return result;
}
This is a lot more general and extensible. Thanks!
Why do you think there'd be dramatic complication of code?
(Granted, at some point we must acknowledge that some egg
breaking is necessary for the proverbial omelette.)
From my experience with C++ containers. Having an alias for a
type is okay, but a bank of aliases where one is a pointer to the
type, one is a const pointer to the type, etc, makes writing the
involved code feel really unnatural.
Post by Andrei Alexandrescu via Digitalmars-d
The thing is, again, we must make some changes if we want D to
be usable without a GC. One of them is e.g. to not allocate
built-in slices all over the place.
So let the user supply a scratch buffer that will hold the
result? With the RC approach we're still allocating, they just
aren't built-in slices, correct?
Post by Andrei Alexandrescu via Digitalmars-d
That would be overreacting :o).
I hope it is :-)
Andrei Alexandrescu via Digitalmars-d
2014-10-01 17:00:44 UTC
Permalink
So let the user supply a scratch buffer that will hold the result? With
the RC approach we're still allocating, they just aren't built-in
slices, correct?
Correct. -- Andrei
Andrei Alexandrescu via Digitalmars-d
2014-10-01 09:50:31 UTC
Permalink
Post by H. S. Teoh via Digitalmars-d
Yeah, this echoes my concern. This looks not that much different, from a
user's POV, from C++ containers' allocator template parameters. Yes I
know we're not talking about *allocators* per se but about *memory
management*, but I'm talking about the need to explicitly pass mmp to
*every* *single* *function* if you desire anything but the default. How
many people actually *use* the allocator parameter in STL? Certainly,
many people do... but the code is anything but readable / maintainable.
The parallel with STL allocators is interesting, but I'm not worried
about it that much. I don't want to go off on a tangent but I'm fairly
certain std::allocator is hard to use for entirely different reasons
than the intended use patterns of MemoryManagementPolicy.
Post by H. S. Teoh via Digitalmars-d
Not only that, but every single function will have to handle this
parameter somehow, and if static if's at the top of the function is what
we're starting with, I fear seeing what we end up with.
Apparently Sean's idea would take care of that.
Post by H. S. Teoh via Digitalmars-d
Furthermore, in order for this to actually work, it has to be percolated
throughout the entire codebase -- any D library that even remotely uses
Phobos for anything will have to percolate this parameter throughout its
API -- at least, any part of the API that might potentially use a Phobos
function.
Yes, but that's entirely expected. We're adding genuinely new
functionality to Phobos.
Post by H. S. Teoh via Digitalmars-d
Otherwise, you still have the situation where a given D
library doesn't allow the user to select a memory management scheme, and
internally calls Phobos functions with the default settings.
Correct.
Post by H. S. Teoh via Digitalmars-d
So this
can't use a lot of existing libraries because the library depends on the
GC, even if it doesn't assume anything about the MM scheme, but just
happens to call some obscure Phobos function with the default MM
parameter. The only way this could work was if *every* D library author
voluntarily rewrites a lot of code in order to percolate this MM
parameter through to the API, on the off-chance that some obscure user
somewhere might have need to use it. I don't see much likelihood of this
actually happening.
A simple way to put this is: libraries that use the GC will continue to
use the GC. There's no way around that unless we choose to break them all.
Post by H. S. Teoh via Digitalmars-d
Then there's the matter of functions like parseJSON() that need to
allocate nodes and return a tree (or whatever) of these nodes. Note that
they need to *allocate*, not just know what kind of memory management
model is to be used. So how do you propose to address this? Via another
parameter (compile-time or otherwise) to specify which allocator to use?
So how does the memory management parameter solve anything then? And how
would such a thing be implemented? Using a 3-way static-if branch in
every single point in parseJSON where it needs to allocate nodes? We
could just as well write it in C++, if that's the case.
parseJSON() would get a memory management policy parameter, and will use
the currently installed memory allocator for allocation.
Post by H. S. Teoh via Digitalmars-d
This proposal has many glaring holes that need to be fixed before it can
be viable.
Affirmative. That's why it's an RFC, very far from a proposal. I'm glad
I got a bunch of good ideas.


Andrei
Dmitry Olshansky via Digitalmars-d
2014-09-30 18:06:28 UTC
Permalink
Post by Andrei Alexandrescu via Digitalmars-d
auto setExtension(MemoryManagementPolicy mmp = gc, R1, R2)(R1 path, R2 ext)
if (...)
{
static if (mmp == gc) alias S = string;
else alias S = RCString;
S result;
...
return result;
}
Incredible code bloat? Boilerplate in each function for the win?
I'm at a loss as to how it would make things better.
--
Dmitry Olshansky
Andrei Alexandrescu via Digitalmars-d
2014-10-01 09:51:08 UTC
Permalink
Post by Dmitry Olshansky via Digitalmars-d
Post by Andrei Alexandrescu via Digitalmars-d
auto setExtension(MemoryManagementPolicy mmp = gc, R1, R2)(R1 path, R2 ext)
if (...)
{
static if (mmp == gc) alias S = string;
else alias S = RCString;
S result;
...
return result;
}
Incredible code bloat? Boilerplate in each function for the win?
I'm at a loss as to how it would make things better.
Sean's idea to make string an alias of the policy takes care of this
concern. -- Andrei
H. S. Teoh via Digitalmars-d
2014-10-01 17:51:44 UTC
Permalink
Post by Andrei Alexandrescu via Digitalmars-d
Post by Dmitry Olshansky via Digitalmars-d
Post by Andrei Alexandrescu via Digitalmars-d
auto setExtension(MemoryManagementPolicy mmp = gc, R1, R2)(R1 path, R2 ext)
if (...)
{
static if (mmp == gc) alias S = string;
else alias S = RCString;
S result;
...
return result;
}
Incredible code bloat? Boilerplate in each function for the win?
I'm at a loss as to how it would make things better.
Sean's idea to make string an alias of the policy takes care of this
concern. -- Andrei
But Sean's idea only takes strings into account. Strings aren't the only
allocated resource Phobos needs to deal with. So extrapolating from that
idea, each memory management struct (or whatever other aggregate we end
up using), say call it MMP, will have to define MMP.string, MMP.jsonNode
(since parseJSON() needs to allocate not only strings but JSON nodes),
MMP.redBlackTreeNode, MMP.listNode, MMP.userDefinedNode, ...

Nope, still don't see how this could work. Please clarify, kthx.


T
--
Sometimes the best solution to morale problems is just to fire all of the unhappy people. -- despair.com
Kiith-Sa via Digitalmars-d
2014-10-01 18:28:14 UTC
Permalink
On Wednesday, 1 October 2014 at 17:53:43 UTC, H. S. Teoh via
On Wed, Oct 01, 2014 at 02:51:08AM -0700, Andrei Alexandrescu
Post by Andrei Alexandrescu via Digitalmars-d
Post by Dmitry Olshansky via Digitalmars-d
Post by Andrei Alexandrescu via Digitalmars-d
auto setExtension(MemoryManagementPolicy mmp = gc, R1, R2)(R1 path, R2 ext)
if (...)
{
static if (mmp == gc) alias S = string;
else alias S = RCString;
S result;
...
return result;
}
Incredible code bloat? Boilerplate in each function for the
win?
I'm at a loss as to how it would make things better.
Sean's idea to make string an alias of the policy takes care
of this
concern. -- Andrei
But Sean's idea only takes strings into account. Strings aren't
the only
allocated resource Phobos needs to deal with. So extrapolating
from that
idea, each memory management struct (or whatever other
aggregate we end
up using), say call it MMP, will have to define MMP.string,
MMP.jsonNode
(since parseJSON() needs to allocate not only strings but JSON
nodes),
MMP.redBlackTreeNode, MMP.listNode, MMP.userDefinedNode, ...
Nope, still don't see how this could work. Please clarify, kthx.
T
MMP.Ref!redBlackTreeNode ?

(where Ref is e.g. a ref-counted pointer type (like RefCounted
but with class support) for RC MMP but plain GC reference for GC
MMP, etc.)

I kinda like this idea, since it might possibly allow
user-defined memory management policies (which wouldn't get
special compiler treatment that e.g. RC may need, though).
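A rough sketch of how a policy-supplied Ref template could look, under stated assumptions: GCPolicy, RCPolicy, make and makeNode are hypothetical names, the node type is a plain struct, and std.typecons.RefCounted stands in for the ref-counted pointer (it does not support classes, as noted above).

```d
import std.typecons : RefCounted;

// Example node type; a struct because RefCounted does not accept classes.
struct Node
{
    int value;
    this(int v) { value = v; }
}

// Hypothetical GC policy: a reference is a plain pointer to GC memory.
struct GCPolicy
{
    alias Ref(T) = T*;

    static Ref!T make(T, Args...)(Args args)
    {
        return new T(args);
    }
}

// Hypothetical RC policy: a reference is a RefCounted wrapper.
struct RCPolicy
{
    alias Ref(T) = RefCounted!T;

    static Ref!T make(T, Args...)(Args args)
    {
        return RefCounted!T(args);
    }
}

// Generic code names only Policy.Ref!Node, never the concrete handle type.
auto makeNode(Policy)(int value)
{
    Policy.Ref!Node n = Policy.make!Node(value);
    return n;
}

void main()
{
    auto a = makeNode!GCPolicy(7);
    auto b = makeNode!RCPolicy(7);

    static assert(is(typeof(a) == Node*));
    static assert(is(typeof(b) == RefCounted!Node));
    assert(a.value == 7);
    assert(b.value == 7); // forwarded to the payload via alias this
}
```

With this shape a user-defined policy only has to provide Ref and make, so node types such as jsonNode or redBlackTreeNode would not need one alias each.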
Sean Kelly via Digitalmars-d
2014-10-01 18:37:49 UTC
Permalink
On Wednesday, 1 October 2014 at 17:53:43 UTC, H. S. Teoh via
Post by H. S. Teoh via Digitalmars-d
But Sean's idea only takes strings into account. Strings aren't
the only
allocated resource Phobos needs to deal with. So extrapolating
from that
idea, each memory management struct (or whatever other
aggregate we end
up using), say call it MMP, will have to define MMP.string,
MMP.jsonNode
(since parseJSON() needs to allocate not only strings but JSON
nodes),
MMP.redBlackTreeNode, MMP.listNode, MMP.userDefinedNode, ...
Nope, still don't see how this could work. Please clarify, kthx.
Assuming you're willing to take the memoryModel type as a
template argument, I imagine we could do something where the user
can specialize the memoryModel for their own types, a bit like
how information is derived for iterators in C++. The problem is
that this still means passing the memoryModel in as a template
argument. What I'd really want is for it to be a global, except
that templated virtuals are logically impossible. I guess
something could maybe be sorted out via a factory design, but
that's not terribly D-like. I'm at a loss for how to make this
memoryModel thing work the way I'd actually want it to if I were
to use it.
Cliff via Digitalmars-d
2014-10-01 19:23:44 UTC
Permalink
Post by Kiith-Sa via Digitalmars-d
On Wednesday, 1 October 2014 at 17:53:43 UTC, H. S. Teoh via
Post by H. S. Teoh via Digitalmars-d
But Sean's idea only takes strings into account. Strings
aren't the only
allocated resource Phobos needs to deal with. So extrapolating
from that
idea, each memory management struct (or whatever other
aggregate we end
up using), say call it MMP, will have to define MMP.string,
MMP.jsonNode
(since parseJSON() needs to allocate not only strings but JSON
nodes),
MMP.redBlackTreeNode, MMP.listNode, MMP.userDefinedNode, ...
Nope, still don't see how this could work. Please clarify,
kthx.
Assuming you're willing to take the memoryModel type as a
template argument, I imagine we could do something where the
user
can specialize the memoryModel for their own types, a bit like
how information is derived for iterators in C++. The problem is
that this still means passing the memoryModel in as a template
argument. What I'd really want is for it to be a global, except
that templated virtuals are logically impossible. I guess
something could maybe be sorted out via a factory design, but
that's not terribly D-like. I'm at a loss for how to make this
memoryModel thing work the way I'd actually want it to if I were
to use it.
If you were to forget D restrictions for a moment, and consider
an idealized language, how would you express this? Maybe
providing that will trigger some ideas from people beyond what we
have seen so far by removing implied restrictions.
Andrei Alexandrescu via Digitalmars-d
2014-10-01 21:23:59 UTC
Permalink
Post by H. S. Teoh via Digitalmars-d
But Sean's idea only takes strings into account. Strings aren't the only
allocated resource Phobos needs to deal with. So extrapolating from that
idea, each memory management struct (or whatever other aggregate we end
up using), say call it MMP, will have to define MMP.string, MMP.jsonNode
(since parseJSON() needs to allocate not only strings but JSON nodes),
MMP.redBlackTreeNode, MMP.listNode, MMP.userDefinedNode, ...
Nope, still don't see how this could work. Please clarify, kthx.
There's management for T[], pointers to structs, pointers to class
objects, associative arrays, and that covers everything. -- Andrei
via Digitalmars-d
2014-09-30 19:10:17 UTC
Permalink
Ok, here are my few cents:

On Monday, 29 September 2014 at 10:49:53 UTC, Andrei Alexandrescu
Post by Andrei Alexandrescu via Digitalmars-d
Back when I've first introduced RCString I hinted that we have
a larger strategy in mind. Here it is.
The basic tenet of the approach is to reckon and act on the
fact that memory allocation (the subject of allocators) is an
entirely distinct topic from memory management, and more
generally resource management. This clarifies that it would be
wrong to approach alternatives to GC in Phobos by means of
allocators. GC is not only an approach to memory allocation,
but also an approach to memory management. Reducing it to
either one is a mistake. In hindsight this looks rather obvious
but it has caused me and many people better than myself a lot
of headache.
I would argue that GC is at its core _only_ a memory management
strategy. It just so happens that the one in D's runtime also
comes with an allocator, with which it is tightly integrated. In
theory, a GC can work with any (and multiple) allocators, and you
could of course also call GC.free() manually, because, as you
say, management and allocation are entirely distinct topics.
Post by Andrei Alexandrescu via Digitalmars-d
That said allocators are nice to have and use, and I will
definitely follow up with std.allocator. However, std.allocator
Agreed.
Post by Andrei Alexandrescu via Digitalmars-d
Nor are ranges. There is an attitude that either output ranges,
or input ranges in conjunction with lazy computation, would
solve the issue of creating garbage.
https://github.com/D-Programming-Language/phobos/pull/2423 is a
good illustration of the latter approach: a range would be
lazily created by chaining stuff together. A range-based
approach would take us further than the allocators, but I see
(a) the whole approach doesn't stand scrutiny for non-linear
outputs, e.g. outputting some sort of associative array or
really any composite type quickly becomes tenuous either with
an output range (eager) or with exposing an input range (lazy);
(b) makes the style of programming without GC radically
different, and much more cumbersome, than programming with GC;
as a consequence, programmers who consider changing one
approach to another, or implementing an algorithm neutral to
it, are looking at a major rewrite;
of character; technically, I have long gotten used to seeing
most elaborate C++ code like poor emulation of simple D idioms.
But C++ has spent years and decades taking to perfection an
approach without a tracing garbage collector. A departure from
that would need to be superior, and that doesn't seem to be the
case with range-based approaches.
I agree with this, too.
Post by Andrei Alexandrescu via Digitalmars-d
===========
Now that we clarified that these existing attempts are not
going to work well, the question remains what does. For Phobos
enum MemoryManagementPolicy { gc, rc, mrc }
immutable
gc = MemoryManagementPolicy.gc,
rc = MemoryManagementPolicy.rc,
mrc = MemoryManagementPolicy.mrc;
(a) gc is the classic garbage-collected style of management;
(b) rc is a reference-counted style still backed by the GC,
i.e. the GC will still be able to pick up cycles and other
kinds of leaks.
(c) mrc is a reference-counted style backed by malloc.
(It should be possible to collapse rc and mrc together and make
the distinction dynamically, at runtime. I'm distinguishing
them statically here for expository purposes.)
The policy is a template parameter to functions in Phobos (and
elsewhere), and informs the functions e.g. what types to
auto setExtension(MemoryManagementPolicy mmp = gc, R1, R2)(R1 path, R2 ext)
if (...)
{
static if (mmp == gc) alias S = string;
else alias S = RCString;
S result;
...
return result;
}
auto p1 = setExtension("hello", ".txt"); // fine, use gc
auto p2 = setExtension!gc("hello", ".txt"); // same
auto p3 = setExtension!rc("hello", ".txt"); // fine, use rc
So by default it's going to continue being business as usual,
but certain functions will allow passing in a (defaulted)
policy for memory management.
This, however, I disagree with strongly. For one thing - this has
already been noted by others - it would make the functions'
implementation extremely ugly (`static if` hell), it would make
them harder to unit test, and from a user's point of view, it's
very tedious and might interfere badly with UFCS.

But more importantly, IMO, it's the wrong thing to do. These
functions shouldn't know anything about memory management policy
at all. They allocate, which means they need to know about
_allocation_ policy, but memory _management_ policy needs to be
decided by the user.

Now, your suggestion in a way still leaves that decision to the
user, but does so in a very intrusive way, by passing a template
flag. This is clearly a violation of the separation of concerns.
Contrary to the typical case, implementation details of the
user's code leak into the library code, and not the other way
round, but that's just as bad.

I'm convinced this isn't necessary. Let's take `setExtension()`
as an example, standing in for any of a class of similar
functions. This function allocates memory, returns it, and
abandons it; it gives up ownership of the memory. The fact that
the memory has been freshly allocated means that it is (head)
unique, and therefore the caller (= library user) can take over
the ownership. This, in turn, means that the caller can decide
how she wants to manage it.

(I'll try to make a sketch on how this can be implemented in
another post.)

As a conclusion, I would say that APIs should strive for the
following principles, in this order:

1. Avoid allocation altogether, for example by laziness (ranges),
or by accepting sinks.

2. If allocations are necessary (or desirable, to make the API
more easily usable), try hard to return a unique value (this of
course needs to be expressed in the return type).

3. If both of the above fail, only then return a GCed pointer,
or alternatively provide several variants of the function (though
this shouldn't be necessary often). An interesting alternative:
Instead of passing a flag directly describing the policy, pass
the function a type that it should wrap its return value in.
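As an illustration of the first principle (accepting sinks), here is a small sketch. FixedSink and writeWithExtension are hypothetical names; the only assumption about a sink is that it offers a put method taking a character slice, which std.array.Appender also provides.

```d
import std.array : appender;

// Hypothetical caller-owned sink backed by a fixed stack buffer.
struct FixedSink
{
    char[64] buf;
    size_t len;

    void put(const(char)[] s)
    {
        // Bounds-checked copy: writing past 64 bytes raises a range error.
        buf[len .. len + s.length] = s[];
        len += s.length;
    }

    const(char)[] data() const
    {
        return buf[0 .. len];
    }
}

// The function itself allocates nothing; it only writes into the sink.
void writeWithExtension(Sink)(ref Sink sink, const(char)[] path, const(char)[] ext)
{
    sink.put(path);
    sink.put(ext);
}

void main()
{
    // No heap allocation: the result lives in the caller's stack buffer.
    FixedSink s;
    writeWithExtension(s, "hello", ".txt");
    assert(s.data == "hello.txt");

    // The same function works with a GC-backed appender.
    auto app = appender!string();
    writeWithExtension(app, "hello", ".txt");
    assert(app.data == "hello.txt");
}
```

Where the result is stored is decided entirely by the caller's choice of sink.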

As for the _allocation_ strategy: It indeed needs to be
configurable, but here, the same objections against a template
parameter apply. As the allocator doesn't necessarily need to be
part of the type, a (thread) global variable can be used to
specify it. This lends itself well to idioms like

with(MyAllocator alloc) {
// ...
}
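The `with(MyAllocator alloc)` form above is not existing D syntax. A comparable effect can be approximated today with a thread-local variable and a scope guard, as in this sketch; Allocator, GCAllocator, CountingAllocator, currentAllocator, ScopedAllocator and duplicate are all hypothetical names, not std.allocator.

```d
// Hypothetical minimal allocator interface.
interface Allocator
{
    void[] allocate(size_t n);
}

final class GCAllocator : Allocator
{
    void[] allocate(size_t n) { return new ubyte[n]; }
}

// Counts requests so that the switch is observable; still GC-backed.
final class CountingAllocator : Allocator
{
    size_t calls;
    void[] allocate(size_t n) { ++calls; return new ubyte[n]; }
}

// Module-level variables are thread-local by default in D.
Allocator currentAllocator;

// Runs once per thread, giving each thread a default allocator.
static this()
{
    currentAllocator = new GCAllocator;
}

// Scope guard: installs an allocator and restores the previous one on exit.
struct ScopedAllocator
{
    private Allocator saved;

    @disable this(this);

    this(Allocator a)
    {
        saved = currentAllocator;
        currentAllocator = a;
    }

    ~this()
    {
        if (saved !is null)
            currentAllocator = saved;
    }
}

// Library-style function: uses whichever allocator is current.
char[] duplicate(const(char)[] s)
{
    auto buf = cast(char[]) currentAllocator.allocate(s.length);
    buf[] = s[];
    return buf;
}

void main()
{
    auto counting = new CountingAllocator;
    {
        auto guard = ScopedAllocator(counting);
        assert(duplicate("abc") == "abc"); // served by counting
    }
    duplicate("def");                      // default allocator again
    assert(counting.calls == 1);
}
```

The allocator never appears in the signature of duplicate, which is the point of keeping it out of the template parameters.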
Post by Andrei Alexandrescu via Digitalmars-d
Destroy!
Done :-)
via Digitalmars-d
2014-09-30 19:50:30 UTC
Permalink
Post by via Digitalmars-d
I'm convinced this isn't necessary. Let's take `setExtension()`
as an example, standing in for any of a class of similar
functions. This function allocates memory, returns it, and
abandons it; it gives up ownership of the memory. The fact that
the memory has been freshly allocated means that it is (head)
unique, and therefore the caller (= library user) can take over
the ownership. This, in turn, means that the caller can decide
how she wants to manage it.
(I'll try to make a sketch on how this can be implemented in
another post.)
Ok. What we need for it:

1) @unique, or a way to expressly specify uniqueness on a
function's return type, as well as restrict function params by it
(and preferably overloading on uniqueness). DMD already has this
concept internally, it just needs to be formalized.

2) A few modifications to RefCounted to be constructable from
unique values.

3) A wrapper type similar to std.typecons.Unique, that also
supports moving. Let's call it Owned(T).

4) Borrowing.

setExtension() can then look like this:

Owned!string setExtension(in char[] path, in char[] ext);

To be used:

void saveFileAs(in char[] name) {
import std.path: setExtension;
import std.file: write;
name. // scope const(char[])
setExtension("txt"). // Owned!string
write(data);
}

The Owned(T) value implicitly converts to `scope!this(T)` via
alias this; it can therefore be conveniently passed to
std.file.write() (which already takes the filename as `in`)
without copying or moving. The value then is released
automatically at the end of the statement, because it is only a
temporary and is not assigned to a variable.

For transferring ownership:

RefCounted!string[] filenames;
// ...
filenames ~= name.setExtension("txt").release;

`Owned!T.release()` returns the payload as a unique value, and
resets the payload to its init value (in this case `null`).
RefCounted's constructor then accepts this unique value and takes
ownership of it. When the Owned value's destructor is called, it
finds the payload to be null and doesn't free the memory.
Inlining and subsequent optimization can turn the destructor into
a no-op in this case.

Optionally, Owned!T can provide an `alias this` to its release
method; in this case, the method doesn't need to be called
explicitly. It is however debatable whether being explicit with
moving isn't the better choice.
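A simplified sketch of the Owned(T)/release mechanics described above. It assumes a malloc-backed payload and omits @unique, borrowing and the RefCounted integration, none of which exist in this form; the frees counter is only there to make the destructor's behaviour observable.

```d
import core.stdc.stdlib : malloc, free;

// Hypothetical unique owner of a single malloc-allocated T.
struct Owned(T)
{
    private T* payload;
    static int frees;            // instrumentation for the example only

    @disable this(this);         // unique: copying is not allowed

    this(T value)
    {
        payload = cast(T*) malloc(T.sizeof);
        assert(payload !is null, "allocation failed");
        *payload = value;
    }

    ~this()
    {
        // After release() the payload is null and nothing is freed.
        if (payload !is null)
        {
            free(payload);
            ++frees;
        }
    }

    // Temporary access without giving up ownership.
    ref inout(T) get() inout { return *payload; }

    // Hands the payload to the caller and resets this owner to null.
    T* release()
    {
        auto p = payload;
        payload = null;
        return p;
    }
}

void main()
{
    {
        auto a = Owned!int(1);
        assert(a.get == 1);
    }                              // destructor frees the payload
    assert(Owned!int.frees == 1);

    int* raw;
    {
        auto b = Owned!int(2);
        raw = b.release();         // ownership transferred to raw
    }                              // destructor finds null and does nothing
    assert(Owned!int.frees == 1);
    assert(*raw == 2);
    free(raw);                     // the new owner is now responsible
}
```

In the scheme proposed above, the receiver of release() would be a RefCounted constructor rather than a raw pointer.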
Andrei Alexandrescu via Digitalmars-d
2014-10-01 09:52:50 UTC
Permalink
Post by via Digitalmars-d
I would argue that GC is at its core _only_ a memory management
strategy. It just so happens that the one in D's runtime also comes with
an allocator, with which it is tightly integrated. In theory, a GC can
work with any (and multiple) allocators, and you could of course also
call GC.free() manually, because, as you say, management and allocation
are entirely distinct topics.
I'm not very sure. A GC might need to interoperate closely with the
allocator. -- Andrei
via Digitalmars-d
2014-10-01 11:00:10 UTC
Permalink
On Wednesday, 1 October 2014 at 09:52:46 UTC, Andrei Alexandrescu
Post by Andrei Alexandrescu via Digitalmars-d
Post by via Digitalmars-d
I would argue that GC is at its core _only_ a memory management
strategy. It just so happens that the one in D's runtime also
comes with
an allocator, with which it is tightly integrated. In theory,
a GC can
work with any (and multiple) allocators, and you could of
course also
call GC.free() manually, because, as you say, management and
allocation
are entirely distinct topics.
I'm not very sure. A GC might need to interoperate closely with
the allocator. -- Andrei
It needs to know what to scan (ideally with type info), and which
allocator to release memory with, but it doesn't need to be an
allocator itself. It certainly helps with the implementation, but
ideally there would be a well defined interface between
allocators and GCs, so that both can be plugged in as desired,
even with multiple GCs in parallel.
Oren Tirosh via Digitalmars-d
2014-10-01 15:48:38 UTC
Permalink
Post by via Digitalmars-d
[...]
I'm convinced this isn't necessary. Let's take `setExtension()`
as an example, standing in for any of a class of similar
functions. This function allocates memory, returns it, and
abandons it; it gives up ownership of the memory. The fact that
the memory has been freshly allocated means that it is (head)
unique, and therefore the caller (= library user) can take over
the ownership. This, in turn, means that the caller can decide
how she wants to manage it.
Bingo. Have some way to mark the function return type as a unique
pointer. This does not imply full-fledged unique pointer type
support in the language - just enough to have the caller ensure
continuity of memory management policy from there.

One problem with actually implementing this is that using
reference counting as a memory management policy requires extra
space for the reference counter in the object, just as garbage
collection requires support for scanning and identification of
interior object memory range. While allocation and memory
management may be quite independent in theory, practical high
performance implementations tend to be intimately related.
Post by via Digitalmars-d
(I'll try to make a sketch on how this can be implemented in
another post.)
Do elaborate!
Post by via Digitalmars-d
As a conclusion, I would say that APIs should strive for the
1. Avoid allocation altogether, for example by laziness
(ranges), or by accepting sinks.
2. If allocations are necessary (or desirable, to make the API
more easily usable), try hard to return a unique value (this of
course needs to be expressed in the return type).
3. If both of the above fail, only then return a GCed pointer,
or alternatively provide several variants of the function
(though this shouldn't be necessary often). An interesting
alternative: Instead of passing a flag directly describing the
policy, pass the function a type that it should wrap its
return value in.
As for the _allocation_ strategy: It indeed needs to be
configurable, but here, the same objections against a template
parameter apply. As the allocator doesn't necessarily need to
be part of the type, a (thread) global variable can be used to
specify it. This lends itself well to idioms like
with(MyAllocator alloc) {
// ...
}
Assuming there is some dependency between the allocator and the
memory management policy I guess this would be initialized on
thread start that cannot be modified later. All code running
inside the thread would need to either match the configured
policy, not handle any kind of pointers or use a limited subset
of unique pointers. Another way to ensure that code can run on
either RC or GC is to make certain objects (specifically,
Exceptions) always allocate a reference counter, regardless of
the currently configured policy.
bearophile via Digitalmars-d
2014-10-01 15:58:49 UTC
Permalink
Post by Oren Tirosh via Digitalmars-d
Bingo. Have some way to mark the function return type as a
unique pointer. This does not imply full-fledged unique pointer
type support in the language
Let's have full-fledged memory zones tracking in the D type
system :-)

Bye,
bearophile
Andrei Alexandrescu via Digitalmars-d
2014-10-01 17:13:43 UTC
Permalink
Post by Oren Tirosh via Digitalmars-d
[...]
I'm convinced this isn't necessary. Let's take `setExtension()` as an
example, standing in for any of a class of similar functions. This
function allocates memory, returns it, and abandons it; it gives up
ownership of the memory. The fact that the memory has been freshly
allocated means that it is (head) unique, and therefore the caller (=
library user) can take over the ownership. This, in turn, means that
the caller can decide how she wants to manage it.
Bingo. Have some way to mark the function return type as a unique
pointer.
I'm skeptical about this approach (though clearly we need to explore it
for e.g. passing ownership of data across threads). For strings and
other "casual" objects I think we should focus on GC/RC strategies. This
is because people do things like:

auto s = setExtension(s1, s2);

and then attempt to use s as a regular variable (copy it etc). Making s
unique would make usage quite surprising and cumbersome.


Andrei
Oren T via Digitalmars-d
2014-10-01 17:25:37 UTC
Permalink
On Wednesday, 1 October 2014 at 17:13:38 UTC, Andrei Alexandrescu
Post by Andrei Alexandrescu via Digitalmars-d
On Tuesday, 30 September 2014 at 19:10:19 UTC, Marc Schütz
Post by via Digitalmars-d
[...]
I'm convinced this isn't necessary. Let's take
`setExtension()` as an
example, standing in for any of a class of similar functions. This
function allocates memory, returns it, and abandons it; it
gives up
ownership of the memory. The fact that the memory has been
freshly
allocated means that it is (head) unique, and therefore the
caller (=
library user) can take over the ownership. This, in turn,
means that
the caller can decide how she wants to manage it.
Bingo. Have some way to mark the function return type as a
unique
pointer.
I'm skeptical about this approach (though clearly we need to
explore it for e.g. passing ownership of data across threads).
For strings and other "casual" objects I think we should focus
auto s = setExtension(s1, s2);
and then attempt to use s as a regular variable (copy it etc).
Making s unique would make usage quite surprising and
cumbersome.
The idea is that the unique property is very short-lived: the
caller immediately assigns it to a pointer of the appropriate
policy: either RC or GC. This keeps the callee agnostic of the
chosen policy and does not require templating multiple versions
of the code. The allocator configured for the thread must match
the generated code at the call site i.e. if the caller uses RC
pointers the allocator must allocate space for the reference
counter (at negative offset to keep compatibility).
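One possible reading of "reference counter at negative offset", sketched with malloc. The helper names allocCounted, refCount, retain and release are hypothetical, and the sketch assumes the payload needs no stricter alignment than size_t.

```d
import core.stdc.stdlib : malloc, free;

// Allocates payloadSize bytes preceded by a size_t counter and returns a
// pointer to the payload. The counter starts at 1.
void* allocCounted(size_t payloadSize)
{
    auto base = cast(size_t*) malloc(size_t.sizeof + payloadSize);
    if (base is null)
        return null;
    *base = 1;
    return base + 1; // payload starts right after the counter
}

// The counter lives one size_t before the payload pointer.
ref size_t refCount(void* payload)
{
    return *(cast(size_t*) payload - 1);
}

void retain(void* payload)
{
    ++refCount(payload);
}

// Returns true if this call dropped the last reference and freed the block.
bool release(void* payload)
{
    if (--refCount(payload) == 0)
    {
        free(cast(size_t*) payload - 1);
        return true;
    }
    return false;
}

void main()
{
    auto p = cast(int*) allocCounted(int.sizeof);
    assert(p !is null);
    *p = 5;

    retain(p);
    assert(refCount(p) == 2);

    assert(!release(p)); // one reference remains
    assert(*p == 5);
    assert(release(p));  // last reference: block is freed
}
```

Because the payload pointer itself is unchanged, code that ignores the counter (for example GC-style callers) can use the same pointer.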
Andrei Alexandrescu via Digitalmars-d
2014-10-01 17:33:39 UTC
Permalink
The idea is that the unique property is very short-lived: the caller
immediately assigns it to a pointer of the appropriate policy: either RC
or GC. This keeps the callee agnostic of the chosen policy and does not
require templating multiple versions of the code. The allocator
configured for the thread must match the generated code at the call site
i.e. if the caller uses RC pointers the allocator must allocate space
for the reference counter (at negative offset to keep compatibility).
This all... looks arcane. I'm not sure how it can even be made to work if
user code just uses "auto". -- Andrei
Oren T via Digitalmars-d
2014-10-01 18:30:37 UTC
Permalink
On Wednesday, 1 October 2014 at 17:33:34 UTC, Andrei Alexandrescu
Post by Andrei Alexandrescu via Digitalmars-d
Post by Oren T via Digitalmars-d
The idea is that the unique property is very short-lived: the
caller
immediately assigns it to a pointer of the appropriate policy: either RC
or GC. This keeps the callee agnostic of the chosen policy and does not
require templating multiple versions of the code. The allocator
configured for the thread must match the generated code at the call site
i.e. if the caller uses RC pointers the allocator must
allocate space
for the reference counter (at negative offset to keep
compatibility).
This all... looks arcane. I'm not sure how it can even be made to
work if user code just uses "auto". -- Andrei
At the moment, @nogc code can't call any function returning a
pointer. Under this scheme @nogc is allowed to call either code
that returns an explicitly RC type (Exception, RCString) or code
returning an "agnostic" unique pointer that may be used from
either @gc or @nogc code.
I already see some holes and problems, but I wonder if something
along these lines may be made to work.
Jacob Carlborg via Digitalmars-d
2014-10-02 06:29:24 UTC
Permalink
The idea is that the unique property is very short-lived: the caller
immediately assigns it to a pointer of the appropriate policy: either RC
or GC. This keeps the callee agnostic of the chosen policy and does not
require templating multiple versions of the code. The allocator
configured for the thread must match the generated code at the call site
i.e. if the caller uses RC pointers the allocator must allocate space
for the reference counter (at negative offset to keep compatibility).
Can't we do something like this, or it might be what you're proposing:

Foo foo () { return new Foo; }

@gc a = foo(); // a contains an instance of Foo allocated with the GC
@rc b = foo(); // b contains an instance of Foo allocated with the RC allocator
--
/Jacob Carlborg
via Digitalmars-d
2014-10-02 09:41:10 UTC
Permalink
Post by Jacob Carlborg via Digitalmars-d
@gc a = foo(); // a contains an instance of Foo allocated with the GC
@rc b = foo(); // b contains an instance of Foo allocated with the RC allocator
That would be better, but how do you deal with "bar(foo())" ?
Context dependent instantiation is a semantic challenge when you
also have overloading, but I guess you can get somewhere if you
make whole program optimization mandatory and use a
state-of-the-art constraint solver to handle the type system.
Could lead you to NP-complete type resolution? But still doable
(in most cases).

I think you basically have 2 realistic choices if you want
easy-going syntax for the end user:

1. implement rc everywhere in standard libraries and make it
possible to turn off rc in a call-chain by having compiler
support (and whole program optimization). To support manual
management you need some kind of protocol for traversing
allocated data-structures to free them.

e.g.:

define memory strategy @malloc =
some
manual
allocation
strategy
description;

auto a = bar(foo()); // use gc or rc based on compiler flag
auto a = @rc( bar(foo()) ); // use rc in a gc context
auto a = @malloc( bar(foo()) ); // manual management (requires a protocol for traversal of recursive data structures)


2. provide allocation strategy as a parameter

e.g.:

auto a = foo(); // alloc with gc
auto a = foo!rc(); // alloc with rc
auto a = foo!malloc(); // alloc with malloc
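
A minimal sketch of option 2 in today's D, assuming a hypothetical `Strategy` enum and a toy `Foo` struct (neither is from the thread or from Phobos). The allocation strategy is a template parameter with a default, so the plain call keeps using the GC:

```d
import core.stdc.stdlib : free, malloc;

// Hypothetical names, for illustration only.
enum Strategy { gc, malloc }

struct Foo { int x; }

// The caller picks the strategy; the default keeps `foo()` on the GC.
Foo* foo(Strategy s = Strategy.gc)()
{
    static if (s == Strategy.gc)
    {
        return new Foo(42);              // GC-managed
    }
    else
    {
        auto p = cast(Foo*) malloc(Foo.sizeof);
        assert(p !is null, "malloc failed");
        *p = Foo(42);                    // caller must free() this
        return p;
    }
}

void main()
{
    auto a = foo();                      // alloc with gc
    auto b = foo!(Strategy.malloc)();    // alloc with malloc
    assert(a.x == 42 && b.x == 42);
    free(b);                             // manual release for the malloc variant
}
```

Both instantiations return a plain `Foo*`, so nothing in the type tells the caller which one must be freed by hand. That is one reason the strategy tends to leak into the return type in practice.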

But going the C++ way of having explicit allocators and
non-embedded reference counters (double indirection) probably is
the easier solution in terms of bringing D to completion.

How many years are you going to spend on making D ref count by
default in a flawless and performant manner? Sure having RC being
as easy to use as GC is a nice idea, but if it turns out to be
either slower or more bug ridden than GC, then what is the point?

Note that:

1. A write to a ref-count means the 64-byte cache line is dirty
and has to be written back to memory. So you don't write 4 bytes,
you write 64 bytes. That's pretty expensive.

2. The memory bus is increasingly becoming the bottleneck of
hardware architectures.

=> RC everywhere without heavy duty compiler/hardware support is
a bad long term idea.
Jacob Carlborg via Digitalmars-d
2014-10-02 11:41:13 UTC
Permalink
On 02/10/14 11:41, "Ola Fosheim Grøstad"
That would be better, but how do you deal with "bar(foo())" ? Context
dependent instantiation is a semantic challenge when you also have
overloading, but I guess you can get somewhere if you make whole program
optimization mandatory and use a state-of-the-art constraint solver to
handle the type system. Could lead you to NP-complete type resolution?
But still doable (in most cases).
I haven't really thought how it could be implemented but I was hoping
that the caller could magically decide the allocation strategy instead
of the callee. It looks like Rust is doing something like that but I
haven't looked at it in detail.
--
/Jacob Carlborg
via Digitalmars-d
2014-10-01 20:56:33 UTC
Permalink
On Wednesday, 1 October 2014 at 17:13:38 UTC, Andrei Alexandrescu
Post by Andrei Alexandrescu via Digitalmars-d
Post by Oren Tirosh via Digitalmars-d
Bingo. Have some way to mark the function return type as a unique
pointer.
I'm skeptical about this approach (though clearly we need to
explore it for e.g. passing ownership of data across threads).
For strings and other "casual" objects I think we should focus on GC/RC strategies.
auto s = setExtension(s1, s2);
and then attempt to use s as a regular variable (copy it etc).
Making s unique would make usage quite surprising and
cumbersome.
Sure? I already showed in an example how it is possible to chain
calls seamlessly that return unique objects. The users would only
notice it when they are trying to make a real copy (i.e. not
borrowing). Do you think this happens frequently enough to be of
concern?
Andrei Alexandrescu via Digitalmars-d
2014-10-01 21:26:56 UTC
Permalink
Post by Andrei Alexandrescu via Digitalmars-d
Post by Oren Tirosh via Digitalmars-d
Bingo. Have some way to mark the function return type as a unique
pointer.
I'm skeptical about this approach (though clearly we need to explore
it for e.g. passing ownership of data across threads). For strings and
other "casual" objects I think we should focus on GC/RC strategies.
auto s = setExtension(s1, s2);
and then attempt to use s as a regular variable (copy it etc). Making
s unique would make usage quite surprising and cumbersome.
Sure? I already showed in an example how it is possible to chain calls
seamlessly that return unique objects. The users would only notice it
when they are trying to make a real copy (i.e. not borrowing). Do you
think this happens frequently enough to be of concern?
I'd think so. -- Andrei
via Digitalmars-d
2014-10-01 20:51:06 UTC
Permalink
On Tuesday, 30 September 2014 at 19:10:19 UTC, Marc Schütz
One problem with actually implementing this is that using
reference counting as a memory management policy requires extra
space for the reference counter in the object, just as garbage
collection requires support for scanning and identification of
interior object memory range. While allocation and memory
management may be quite independent in theory, practical high
performance implementations tend to be intimately related.
Post by via Digitalmars-d
(I'll try to make a sketch on how this can be implemented in
another post.)
Do elaborate!
Post by via Digitalmars-d
As a conclusion, I would say that APIs should strive for the following:
1. Avoid allocation altogether, for example by laziness
(ranges), or by accepting sinks.
2. If allocations are necessary (or desirable, to make the API
more easily usable), try hard to return a unique value (this
of course needs to be expressed in the return type).
3. If both of the above fails, only then return a GCed
pointer, or alternatively provide several variants of the
function (though this shouldn't be necessary often). An
interesting alternative: Instead of passing a flag directly
describing the policy, pass the function a type that it should
wrap its return value in.
As for the _allocation_ strategy: It indeed needs to be
configurable, but here, the same objections against a template
parameter apply. As the allocator doesn't necessarily need to
be part of the type, a (thread) global variable can be used to
specify it. This lends itself well to idioms like
with(MyAllocator alloc) {
// ...
}
Assuming there is some dependency between the allocator and the
memory management policy I guess this would be initialized at
thread start and cannot be modified later. All code running
inside the thread would need to either match the configured
policy, not handle any kind of pointers or use a limited subset
of unique pointers. Another way to ensure that code can run on
either RC or GC is to make certain objects (specifically,
Exceptions) always allocate a reference counter, regardless of
the currently configured policy.
I don't have all answers to these questions. Still, I'm convinced
this is doable.

A straightforward and general way to convert a unique object
to a ref-counted one is to allocate new memory for it plus the
reference count, move the original object into it, and release
the original memory. This is safe, because there can be no
external pointers to the object, as it is unique. Of course, this
can be optimized if the allocator supports extending an
allocation. It could then preallocate a few extra bytes at the
end to make the extend operation always succeed, similar to your
suggestion to always allocate a reference counter.
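
The conversion described above can be sketched as follows. This is only an illustration under stated assumptions: `RcBox` and `toRefCounted` are hypothetical names, the unique object is assumed to live in `malloc`-ed memory, and the payload is assumed to be bitwise-movable (no internal pointers to itself).

```d
import core.stdc.stdlib : free, malloc;
import core.stdc.string : memcpy;

struct Payload { int value; }

// Hypothetical layout: the counter sits next to the object.
struct RcBox(T)
{
    size_t count;
    T item;
}

// Takes over a unique, malloc-owned object and returns a ref-counted block.
RcBox!T* toRefCounted(T)(ref T* unique)
{
    alias Box = RcBox!T;
    auto box = cast(Box*) malloc(Box.sizeof);   // new memory: object + counter
    assert(box !is null, "malloc failed");
    box.count = 1;
    // Bitwise move; safe because nobody else can point at *unique.
    memcpy(&box.item, unique, T.sizeof);
    free(unique);                               // release the original memory
    unique = null;                              // the unique handle is consumed
    return box;
}

void main()
{
    auto u = cast(Payload*) malloc(Payload.sizeof);
    assert(u !is null);
    *u = Payload(7);

    auto rc = toRefCounted(u);
    assert(u is null);
    assert(rc.count == 1 && rc.item.value == 7);
    free(rc);
}
```

An allocator that can extend an allocation in place would let the copy and the second allocation be skipped, which is the optimization mentioned above.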

I think the most difficult part is to find an efficient and
user-friendly way for the wrapper types to get at the allocator.
Maybe the allocators should all implement an interface (a real
one, not duck-typing). The wrappers (Owned, RC) can then include
a pointer to the allocator (or for RC, embed it next to the
reference count). This would make it possible to specify a
(thread) global default allocator at runtime, which all library
functions use by convention (for example let's call it `alloc`,
then they would call `alloc.make!MyStruct()`). At the same time,
it is safe to change the default allocator at any time, and to
use different allocators in parallel in the same thread.

The alternative is obviously a template parameter to the function
that returns the unique object. But this unfortunately is then
not restricted to just the function, but "infects" the return
type, too. And from there, it needs to spread to the RC wrapper,
or any containers. Thus we'd have incompatible RC types, which I
would imagine would be very inconvenient and restrictive.
Besides, it would probably be too tedious to specify the
allocator everywhere.

Therefore, I think the additional cost of an allocator interface
pointer is worth it. For Owned!T (with T being a pointer or
reference), it would just be two words, which we can return
efficiently. We already have slices doing that, and AFAIK there's
no significantly worse performance because of them.
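
A rough sketch of that design, assuming a hypothetical `Allocator` interface, a `Mallocator` implementation, and a free function `make` (none of these are the std.allocator API). `Owned!T` stores the object pointer plus the allocator reference, so it is two words wide:

```d
import core.stdc.stdlib : free, malloc;

// Hypothetical run-time interface (a real one, not duck typing).
interface Allocator
{
    void* allocate(size_t size);
    void deallocate(void* p);
}

final class Mallocator : Allocator
{
    void* allocate(size_t size) { return malloc(size); }
    void deallocate(void* p) { free(p); }
}

// Unique owner: object pointer + allocator reference.
struct Owned(T)
{
    T* ptr;
    Allocator source;

    @disable this(this);            // unique: moves only, no copies

    ~this()
    {
        if (ptr !is null)
        {
            source.deallocate(ptr); // hand memory back to where it came from
            ptr = null;
        }
    }
}

// Usable through UFCS as `alloc.make(value)`.
Owned!T make(T)(Allocator a, T value)
{
    auto p = cast(T*) a.allocate(T.sizeof);
    assert(p !is null, "allocation failed");
    *p = value;
    return Owned!T(p, a);
}

struct MyStruct { int a; }

Allocator alloc;                    // module-level variables are thread-local in D

void main()
{
    alloc = new Mallocator;         // the allocator object itself is GC-allocated here
    auto o = alloc.make(MyStruct(3));
    assert(o.ptr.a == 3);
    static assert(Owned!MyStruct.sizeof == 2 * size_t.sizeof);
}
```

Because each `Owned` remembers its own allocator, reassigning `alloc` later does not affect objects that already exist. The cost is one virtual call per allocation and deallocation.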
Manu via Digitalmars-d
2014-10-01 01:53:18 UTC
Permalink
On 29 September 2014 20:49, Andrei Alexandrescu via Digitalmars-d
Post by Andrei Alexandrescu via Digitalmars-d
[...]
Destroy!
Andrei
I generally like the idea, but my immediate concern is that it implies
that every function that may deal with allocation is a template.
This interferes with C/C++ compatibility in a pretty big way. Or more
generally, the idea of a lib. Does this mean that a lib will be
required to produce code for every permutation of functions according
to memory management strategy? Usually libs don't contain code for
uninstantiated templates.

With this in place, I worry that traditional use of libs, separate
compilation, external language linkage, etc, all become very
problematic.
Pervasive templates can only work well if all code is D code, and if
all code is compiled together.
Most non-OSS industry doesn't ship source, they ship libs. And if libs
are to become impractical, then dependencies become a problem; instead
of linking libphobos.so, you pretty much have to compile phobos
together with your app (already basically true for phobos, but it's
fairly unique).
What if that were a much larger library? What if you have 10s of
dependencies all distributed in this manner? Does it scale?

I guess this doesn't matter if this is only a proposal for phobos...
but I suspect the pattern will become pervasive if it works, and yeah,
I'm not sure where that leads.
"Nordlöw" via Digitalmars-d
2014-10-01 05:46:40 UTC
Permalink
On Monday, 29 September 2014 at 10:49:53 UTC, Andrei Alexandrescu
Post by Andrei Alexandrescu via Digitalmars-d
Back when I've first introduced RCString I hinted that we have
a larger strategy in mind. Here it is.
Slightly related :)

https://github.com/D-Programming-Language/phobos/pull/2573
Andrei Alexandrescu via Digitalmars-d
2014-10-01 09:59:02 UTC
Permalink
Post by "Nordlöw" via Digitalmars-d
Post by Andrei Alexandrescu via Digitalmars-d
Back when I've first introduced RCString I hinted that we have a
larger strategy in mind. Here it is.
Slightly related :)
https://github.com/D-Programming-Language/phobos/pull/2573
Nice, thanks! -- Andrei
Andrei Alexandrescu via Digitalmars-d
2014-10-01 09:58:33 UTC
Permalink
Post by Manu via Digitalmars-d
I generally like the idea, but my immediate concern is that it implies
that every function that may deal with allocation is a template.
This interferes with C/C++ compatibility in a pretty big way. Or more
generally, the idea of a lib. Does this mean that a lib will be
required to produce code for every permutation of functions according
to memory management strategy? Usually libs don't contain code for
uninstantiated templates.
If a lib chooses one specific memory management policy, it can of course
be non-templated with regard to that. If it wants to offer its users the
choice, it would probably have to offer some templates.
Post by Manu via Digitalmars-d
With this in place, I worry that traditional use of libs, separate
compilation, external language linkage, etc, all become very
problematic.
Pervasive templates can only work well if all code is D code, and if
all code is compiled together.
Most non-OSS industry doesn't ship source, they ship libs. And if libs
are to become impractical, then dependencies become a problem; instead
of linking libphobos.so, you pretty much have to compile phobos
together with your app (already basically true for phobos, but it's
fairly unique).
What if that were a much larger library? What if you have 10s of
dependencies all distributed in this manner? Does it scale?
I guess this doesn't matter if this is only a proposal for phobos...
but I suspect the pattern will become pervasive if it works, and yeah,
I'm not sure where that leads.
Thanks for the point. I submit that Phobos is and will be different
from other D libraries; as the standard library, it has the role of
supporting widely varying needs, and as such it makes a lot of sense to
make it highly generic and configurable. Libraries that are for specific
domains can avail themselves of a narrower design scope.


Andrei