Discussion:
std.allocator needs your help
Andrei Alexandrescu
2013-09-22 23:49:57 UTC
Permalink
Hello,


I am making good progress on the design of std.allocator, and I am
optimistic about the way it turns out. D's introspection capabilities
really shine through, and in places the design does feel really
archetypal - e.g. "this is the essence of a freelist allocator". It's a
very good feeling. The overall inspiration comes from Berger's
HeapLayers, but D's introspection takes that pattern to a whole new level.

Several details are still in flux, but at the top level it seems most
natural to divide things in two tiers:

1. Typed allocator, i.e. every request for allocation comes with the
exact type requested;

2. Untyped allocator - traffics exclusively in ubyte[].

Initially I thought it was impossible to separate the two, but it turns
out there are profound generic realities about untyped allocators that
make them the perfect foundation for typed allocators, without
considerable abstraction costs.

I need to prove that before I'm sure, because so far I only have an
archetype of untyped allocators, with typed allocators to follow.
There's always the possibility that some hitch shows itself only when
actually trying to implement various typed allocators.

Anyhow, here's the general interface of an untyped allocator. In fact
I'll showcase the simplest allocator - the NullAllocator, which has zero
memory to offer. (In spite of its do-absolutely-nothing stance, the
NullAllocator is surprisingly useful as a "terminator" for certain
layered allocators.) Anyhow, here it is (explanations follow):

struct NullAllocator
{
    enum alignment = real.alignof;
    enum size_t available = 0;
    ubyte[] allocate(size_t s) shared { return null; }
    bool expand(ref ubyte[] b, size_t minDelta, size_t maxDelta) shared
    { assert(b is null); return false; }
    bool reallocate(ref ubyte[] b, size_t) shared
    { assert(b is null); return false; }
    void deallocate(ubyte[] b) shared { assert(b is null); }
    void collect() shared { }
    void deallocateAll() shared { }
    static shared NullAllocator it;
}

Primitives:

* alignment is a compile-time constant yielding the alignment of
allocated data. Many allocators offer the maximum alignment of the
platform (for which I used real.alignof above). This constant is
required for all allocators.

* available is a property that returns how many total (not necessarily
contiguous) bytes are available for allocation. The NullAllocator knows
statically it has 0 bytes available so it implements it as an enum.
Generally allocators will implement it as a @property. This property is
optional.
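
For contrast, a stateful allocator would compute available at run time as
a @property. A hypothetical sketch (FixedBuffer and its members are
illustrative names, not part of the proposal):

```d
// Hypothetical stateful allocator: available is a run-time @property.
struct FixedBuffer
{
    enum alignment = real.alignof;
    private ubyte[] _store; // backing memory
    private size_t _used;   // bytes already handed out
    @property size_t available() const { return _store.length - _used; }
}
```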

* allocate(s) returns a ubyte[] with length at least s, or null. (It
does not throw on out-of-memory because that would hurt composability;
it turns out many elemental allocators do naturally run out of memory.)
This method is required for all allocators. In most allocators this
method should be @safe.
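
Client code therefore tests the returned slice rather than catching an
exception; against the NullAllocator above, for instance:

```d
// Out-of-memory is reported as null, not as a thrown exception.
auto b = NullAllocator.it.allocate(100);
assert(b is null); // the NullAllocator can never satisfy a request
```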

* expand(b, minDelta, maxDelta) grows b's length by at least minDelta
(and on a best-effort basis by at least maxDelta) and returns true, or
does nothing and returns false. In most allocators this should be @safe.
(One key insight is that expand() can be made @safe whereas shrink() or
realloc() are most often not; such mini-epiphanies are very exciting
because they put the design on a beam guide with few principles and many
consequences.) The precondition is that b is null or has been previously
returned by allocate(). This method is optional.
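
Assuming some allocator instance a that implements expand, a call might
look like this (the sizes are illustrative):

```d
ubyte[] b = a.allocate(64);
if (a.expand(b, 32, 128))
{
    // Grown in place by at least minDelta; the block did not move.
    assert(b.length >= 96);
}
else
{
    // b is untouched; fall back to reallocate or allocate-and-copy.
}
```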

* reallocate(b, s) reallocates (like C's realloc) b to at least s bytes
and returns true, or does nothing and returns false. It is possible that
the memory is moved. This method is optional, and usually unsafe. (This
was mostly introduced to accommodate C's realloc() within the allocator
design.)

* deallocate(b) deallocates the memory allocated for b. b must have been
previously allocated by the same allocator. This method is usually
unsafe (but there are notable implementations that may offer safety,
such as unbounded freelists.) This method is optional.

* deallocateAll() deallocates in one shot all memory previously
allocated by this allocator. This method is optional, and when present
is almost always unsafe (I see no meaningful @safe implementation.)
Region allocators are notable examples of allocators that define this
method.

* collect() frees unused memory that had been allocated with this
allocator. This optional method is implemented by tracing collectors and
is usually @safe.

* "it" is an interesting artifact. Allocators may or may not hold
per-instance state. Those that don't are required to define a global
shared or thread-local singleton called "it" that will be used for all
calls related to allocation. Of course, it is preferred for maximum
flexibility that "it" is shared so as to clarify that the allocator is
safe to use among different threads. In the case of the NullAllocator,
this is obviously true, but non-trivial examples are the malloc-based
allocator and the global garbage collector, both of which implement
thread-safe allocation (at great effort no less).

There are quite a few more things to define more precisely, but this
part of the design has become quite stable. To validate the design, I've
defined some simple allocators (Mallocator, GCAllocator, Freelist,
StackRegion, Region etc) but not the more involved ones, such as
coalescing heaps, buddy system etc.

If you have an interest in memory allocators and would like to design
(or have around) an allocator that fits the API above, please post it.
For layering purposes, don't call malloc() or sbrk() for getting large
blocks of memory from the system; instead, use another Allocator as your
parent allocator in a pattern as follows:

struct MyAllocator(Allocator)
{
    static if (is(typeof(Allocator.it))) alias Allocator.it _parent;
    else Allocator _parent;
    ... use _parent for getting memory ...
}
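
To illustrate the pattern, here is a sketch of a fallback combinator
layered this way (my example, not part of the proposal): it tries a
primary allocator and defers to a secondary one on failure.

```d
struct FallbackAllocator(Primary, Secondary)
{
    static if (is(typeof(Primary.it))) alias Primary.it _primary;
    else Primary _primary;
    static if (is(typeof(Secondary.it))) alias Secondary.it _secondary;
    else Secondary _secondary;

    // Only the smaller alignment can be guaranteed for all blocks.
    enum alignment = Primary.alignment < Secondary.alignment
        ? Primary.alignment : Secondary.alignment;

    ubyte[] allocate(size_t s)
    {
        auto b = _primary.allocate(s);
        return b is null ? _secondary.allocate(s) : b;
    }
}
```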

This allows MyAllocator to use stateful and stateless allocators
transparently by just referring to _parent. Of course, your allocator
may take other generic parameters in addition to Allocator, or take none
at all if it's a "root" allocator (sbrk() or Mallocator would be examples).

Destroy!


Andrei
Nicholas Londey
2013-09-23 01:34:00 UTC
Permalink
Post by Andrei Alexandrescu
* allocate(s) returns a ubyte[] with length at least s, or
null. (It does not throw on out-of-memory because that would
hurt composability; it turns out many elemental allocators do
naturally run out of memory.) This method is required for all
Would an 'expected' be a possible alternative to returning null?

Am a big fan of this talk of yours.
http://channel9.msdn.com/Shows/Going+Deep/C-and-Beyond-2012-Andrei-Alexandrescu-Systematic-Error-Handling-in-C
Andrei Alexandrescu
2013-09-23 02:00:39 UTC
Permalink
Post by Nicholas Londey
Post by Andrei Alexandrescu
* allocate(s) returns a ubyte[] with length at least s, or null. (It
does not throw on out-of-memory because that would hurt composability;
it turns out many elemental allocators do naturally run out of
memory.) This method is required for all allocators. In most
Would an 'expected' be a possible alternative to returning null?
Am a big fan of this talk of yours.
http://channel9.msdn.com/Shows/Going+Deep/C-and-Beyond-2012-Andrei-Alexandrescu-Systematic-Error-Handling-in-C
Thanks! I think expected would be appropriate for the high-level allocator.

Andrei
Manu
2013-09-23 01:35:36 UTC
Permalink
On 23 September 2013 09:49, Andrei Alexandrescu wrote:
Post by Andrei Alexandrescu
Hello,
I am making good progress on the design of std.allocator, and I am
optimistic about the way it turns out. [...]
Destroy!
Well it looks like a good start, but the thing I'm left wondering after
reading this is still... how is it actually used?
I think the greatest challenge is finding a simple, clean and correct way
to actually specify which allocator should be used for making allocations
throughout your code, and perhaps more troublesome; within generic code,
and then layers of generic code.

Are you intending to be able to associate a particular allocator with a
class declaration?
What about a struct declaration?
What about a region of code (i.e., a call tree/branch)?
What if the given allocator should be used for most of the tree, except for
a couple of things beneath that always want to use their explicit allocator?

What if I want to associate an allocator instance, not just an allocator
type (i.e., I don't want multiple instances of the same type(s) of
allocators in my code)? How are they shared?

It wasn't clear to me from your demonstration, but 'collect()' implies that
GC becomes allocator-aware; how does that work?

deallocateAll() and collect() may each free a whole lot of memory, but it
seems to me that they may not actually be aware of the individual
allocations they are freeing; how do the appropriate destructors for the
sub-allocations get called?

I have a suspicion you're going to answer most of these questions with the
concept of allocator layering, but I just don't completely see it. It's
quite an additional burden of resources and management to manage the
individual allocations with a range allocator above what is supposed to be
a performance critical allocator to begin with.

C++'s design seems reasonable in some ways, but history has demonstrated
that it's a total failure that is almost never actually used (I've
certainly never seen anyone use it).

Some allocators that I use regularly to think about:

A ring-buffer:
* Like a region allocator I guess, but the circular nature adds some
minor details, and requires the user to mark the heap from time to time,
freeing all allocations between the old mark and the new mark.

A pool:
* Same as a free-list, but with a maximum size, ie, finite pool of
objects pre-allocated and pre-populating the freelist.

A pool-group:
* Allocate from a group of pools allocating differently sized objects.
(this is a good test for allocator layering, supporting a group of pools
above, and fallback to the malloc heap for large objects)
Andrei Alexandrescu
2013-09-23 02:28:42 UTC
Permalink
Post by Manu
Well it looks like a good start, but the thing I'm left wondering after
reading this is still... how is it actually used?
I think the greatest challenge is finding a simple, clean
Oxford comma please :o)
Post by Manu
and correct
way to actually specify which allocator should be used for making
allocations throughout your code, and perhaps more troublesome; within
generic code, and then layers of generic code.
My design makes it very easy to experiment by allowing one to define
complex allocators out of a few simple building blocks. It is not a
general-purpose allocator, but it allows one to define any number of such.
Post by Manu
Are you intending to be able to associate a particular allocator with a
class declaration?
No, or at least not at this level.
Post by Manu
What about a struct declaration?
Same answer.
Post by Manu
What about a region of code (ie, a call tree/branch).
What if the given allocator should be used for most of the tree, except
for a couple of things beneath that always want to use their explicit
allocator?
The proposed design makes it easy to create allocator objects. How they
are used and combined is left to the application.
Post by Manu
What if I want to associate an allocator instance, not just an allocator
type (ie, I don't want multiple instances of the same type(/s) of
allocators in my code, how are they shared?
An allocator instance is a variable like any other. So you use the
classic techniques (shared globals, thread-local globals, passing around
as parameter) for using the same allocator object from multiple places.
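
For instance, the passing-as-parameter technique is plain generic code
(a sketch; fill is an illustrative name):

```d
// The same allocator object can be threaded through any call site.
void fill(Allocator)(ref Allocator a)
{
    auto buf = a.allocate(4096);
    if (buf is null) return;       // allocation may fail
    scope(exit) a.deallocate(buf); // return memory to the same allocator
    // ... use buf ...
}
```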
Post by Manu
It wasn't clear to me from your demonstration, but 'collect()' implies
that GC becomes allocator-aware; how does that work?
No, each allocator has its own means of dealing with memory. One could
define a tracing allocator independent of the global GC.
Post by Manu
deallocateAll() and collect() may each free a whole lot of memory, but
it seems to me that they may not actually be aware of the individual
allocations they are freeing; how do the appropriate destructors for the
sub-allocations get called?
No destructors are called at this level. Again, all these allocators
understand is ubyte[].
Post by Manu
I have a suspicion you're going to answer most of these questions with
the concept of allocator layering, but I just don't completely see it.
No, it's just that at this level some of these questions don't even have
an appropriate answer - like we discuss atoms and molecules and you ask
about building floors and beams and pillars.
Post by Manu
It's quite an additional burden of resources and management to manage
the individual allocations with a range allocator above what is supposed
to be a performance critical allocator to begin with.
I don't understand this.
Post by Manu
C++'s design seems reasonable in some ways, but history has demonstrated
that it's a total failure, which is almost never actually used (I've
certainly never seen anyone use it).
Agreed. I've seen some uses of it that quite fall within the notion of
the proverbial exception that proves the rule.
Post by Manu
* Like a region allocator I guess, but the circular nature adds some
minor details, and requires the user to mark the heap from time to time,
freeing all allocations between the old mark and the new mark.
* Same as a free-list, but with a maximum size, ie, finite pool of
objects pre-allocated and pre-populating the freelist.
I implemented the finite size for a freelist.
Post by Manu
* Allocate from a group of pools allocating differently sized
objects. (this is a good test for allocator layering, supporting a group
of pools above, and fallback to the malloc heap for large objects)
I implemented that as well, it's one of the best designs I've defined in
my life.


Andrei
Brad Roberts
2013-09-23 02:45:17 UTC
Permalink
Post by Andrei Alexandrescu
Post by Manu
Are you intending to be able to associate a particular allocator with a
class declaration?
No, or at least not at this level.
Post by Manu
What about a struct declaration?
Same answer.
Post by Manu
What about a region of code (ie, a call tree/branch).
What if the given allocator should be used for most of the tree, except
for a couple of things beneath that always want to use their explicit
allocator?
The proposed design makes it easy to create allocator objects. How they are used and combined is
left to the application.
I think the question he's asking, which a lot of people are anxiously anticipating, is... how does
this intersect with the parts of the language and core libraries that have been blocked (for a loose
definition of blocked) waiting for the allocator design. Primarily, but far from exclusively, the
container library.

Yes, the allocator design is clearly a lower level component, but it's also far easier than the
integration with the language and libraries.
Andrei Alexandrescu
2013-09-23 03:01:17 UTC
Permalink
Post by Brad Roberts
I think the question he's asking, which a lot of people are anxiously
anticipating, is... how does this intersect with the parts of the language
and core libraries that have been blocked (for a loose definition of
blocked) waiting for the allocator design. Primarily, but far from
exclusively, the container library.
Yes, the allocator design is clearly a lower level component, but it's
also far easier than the integration with the language and libraries.
Containers and other objects that want to allow customization of their
allocation would be parameterized during compilation with an allocator
type. Functions that need to allocate memory may similarly accept a
parameter of allocator type.
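
Roughly along these lines (a sketch, not a committed API):

```d
// Hypothetical container parameterized by allocator type.
struct Array(T, Allocator)
{
    static if (is(typeof(Allocator.it))) alias Allocator.it _alloc;
    else Allocator _alloc;
    private T[] _payload; // storage carved out of _alloc.allocate(...)
}

// The allocator is chosen at compile time:
// Array!(int, Mallocator) a;
```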

One possibility I'm keeping in mind is that of defining a dynamic
interface (i.e. in the OOP sense) for a global allocator. Then people
can globally change what allocator must be used for operator new
invocations.

The work at the level described so far is quite orthogonal to these high
level choices. Right now it's all about a statically-typed allocator
that is good at allocating and deallocating octets.


Andrei
Manu
2013-09-23 04:19:26 UTC
Permalink
On 23 September 2013 13:01, Andrei Alexandrescu wrote:
Post by Andrei Alexandrescu
Post by Brad Roberts
I think the question he's asking, which a lot of people are anxiously
anticipating, is... how does this intersect with the parts of the language
and core libraries that have been blocked (for a loose definition of
blocked) waiting for the allocator design. Primarily, but far from
exclusively, the container library.
Yes, the allocator design is clearly a lower level component, but it's
also far easier than the integration with the language and libraries.
Containers and other objects that want to allow customization of their
allocation would be parameterized during compilation with an allocator
type. Functions that need to allocate memory may similarly accept a
parameter of allocator type.
You've just described C++ verbatim.
And you've previously agreed that C++'s approach totally failed. Can you
explain how this is different from C++, and how it solves the issues that
defeated that design?

Post by Andrei Alexandrescu
One possibility I'm keeping in mind is that of defining a dynamic interface
(i.e. in the OOP sense) for a global allocator. Then people can globally
change what allocator must be used for operator new invocations.
Right, this is getting warmer. It's a stark contrast to what you suggest
above though, and when I try and think this through, it gets very complex,
very fast.
I can't imagine how such a system could ever be declared safe. However,
this is more or less what I want. I don't know how to reconcile the two
requirements.

How much have you thought on this? This is where I think some serious
creativity will need to be applied...

Following this train of thought, I can imagine a really nice end goal would
be that the existing GC is effectively plugged in as a library, and people
can easily substitute it for their own GC if they want/need to.

Post by Andrei Alexandrescu
The work at the level described so far is quite orthogonal to these high
level choices. Right now it's all about a statically-typed allocator that
is good at allocating and deallocating octets.
Andrei
Walter Bright
2013-09-23 05:37:46 UTC
Permalink
Post by Manu
Following this train of thought, I can imagine a really nice end goal would be
that the existing GC is effectively plugged in as a library, and people can
easily substitute it for their own GC if they want/need to.
It already is, and has been from the beginning. Rainer, for example, uses a
different GC for VisualD.

dmd knows naught about the GC.
Manu
2013-09-23 10:17:18 UTC
Permalink
Post by Walter Bright
Following this train of thought, I can imagine a really nice end goal would be
that the existing GC is effectively plugged in as a library, and people can
easily substitute it for their own GC if they want/need to.
It already is, and has been from the beginning. Rainer, for example, uses
a different GC for VisualD.
dmd knows naught about the GC.
Sure, true. But little things remain, like separating it into its own lib
so you can easily link a different one, or not link one at all.
Also, what about language support? How much language support is there
currently?
I don't think I could currently write the ref-counting GC that I'd like
without extensive language support.
inc/dec ref calls would need to be inserted all over the place by the
compiler, and the optimiser would need to know how to remove redundant inc/dec
sequences.
Andrei Alexandrescu
2013-09-23 14:05:12 UTC
Permalink
Post by Walter Bright
Following this train of thought, I can imagine a really nice end goal would be
that the existing GC is effectively plugged in as a library, and people can
easily substitute it for their own GC if they want/need to.
It already is, and has been from the beginning. Rainer, for example,
uses a different GC for VisualD.
Correct.
Post by Walter Bright
dmd knows naught about the GC.
Well except for plugging a library call for calls to new. But that's
expected.

(I'm already pretending delete doesn't exist :o).)


Andrei
Manu
2013-09-23 14:50:02 UTC
Permalink
On 24 September 2013 00:05, Andrei Alexandrescu wrote:
Post by Andrei Alexandrescu
Post by Walter Bright
Following this train of thought, I can imagine a really nice end goal would be
that the existing GC is effectively plugged in as a library, and people can
easily substitute it for their own GC if they want/need to.
It already is, and has been from the beginning. Rainer, for example,
uses a different GC for VisualD.
Correct.
dmd knows naught about the GC.
Well except for plugging a library call for calls to new. But that's
expected.
(I'm already pretending delete doesn't exist :o).)
delete is important if your class is being allocated by a pool or
something...
But you said before, people won't use 'new' if they are using an allocator.
I'm really not sure that's a good idea.
Most people (library authors!) don't actually care about their memory
allocation, they will continue to use 'new', just because it's a keyword.
It also screws with generic code; X should be allocated with 'new', but Y
should be allocated with yAllocator.alloc()?
What if you decide that Z, which allocates with 'new', becomes a problem
and you want to switch it into a pool? You now need to track down every
instance of 'new Z', and change it.
Andrei Alexandrescu
2013-09-23 15:02:17 UTC
Permalink
Post by Manu
delete is important if your class is being allocated by a pool or
something...
It's important as an API function, but not as a language primitive.
"new" should also have been a function.
Post by Manu
But you said before, people won't use 'new' if they are using an
allocator. I'm really not sure that's a good idea.
I don't see another way if people are to define and use multiple allocators.
Post by Manu
Most people (library authors!) don't actually care about their memory
allocation, they will continue to use 'new', just because it's a keyword.
"new considered harmful" etc.
Post by Manu
It also screws with generic code; X should be allocated with 'new', but
Y should be allocated with yAllocator.alloc()?
What if you decide that Z, which allocates with 'new', becomes a problem
and you want to switch it into a pool? You now need to track down every
instance of 'new Z', and change it.
We can only improve that situation by allowing people to replace the
global allocator with their own allocators. Again, there's a disconnect
here - I'm discussing "how to make it easy for people to define
allocators" and you discuss "how to make it possible for people to plug
allocators, once defined, as the global allocator". These are two
distinct endeavors. At the level I'm at, I'm concerned with making good
allocators easy to implement. You may say you don't care, and that's
good feedback, but it's what I have for the time being.


Andrei
Manu
2013-09-23 15:47:08 UTC
Permalink
On 24 September 2013 01:02, Andrei Alexandrescu wrote:
Post by Manu
delete is important if your class is being allocated by a pool or
something...
It's important as an API function, but not as a language primitive. "new"
should also have been a function.
I like operator new. It goes blue, and the statement isn't riddled with
exclamation marks.
Imagine if new was a template (it would have to be a template). Allocating
a template would require nested template syntax every time, which is pretty
ugly.

X x = new!(X!arg(args...)); // ewoo, paren spam...

Post by Manu
But you said before, people won't use 'new' if they are using an
allocator. I'm really not sure that's a good idea.
I don't see another way if people are to define and use multiple allocators.
Well if an allocator is somehow associated with a class/struct, then 'new
MyClass' would invoke that allocator.
If you:

    pushThreadAllocator(myAllocator);
    X x = new X;
    popThreadAllocator();

Then new will use your thread-local allocator.
Continue for global, fiber, etc.

Maybe there's opportunity for 'scope' to offer some nice convenience here?
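
For what it's worth, scope(exit) would make the push/pop pairing
exception-safe (all the names here are hypothetical):

```d
pushThreadAllocator(myAllocator); // hypothetical API
scope(exit) popThreadAllocator(); // restored even if construction throws
X x = new X;                      // would draw from myAllocator
```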

Post by Manu
Most people (library authors!) don't actually care about their memory
allocation, they will continue to use 'new', just because it's a keyword.
"new considered harmful" etc.
Post by Manu
It also screws with generic code; X should be allocated with 'new', but
Y should be allocated with yAllocator.alloc()?
What if you decide that Z, which allocates with 'new', becomes a problem
and you want to switch it into a pool? You now need to track down every
instance of 'new Z', and change it.
We can only improve that situation by allowing people to replace the
global allocator with their own allocators. Again, there's a disconnect
here - I'm discussing "how to make it easy for people to define allocators"
and you discuss "how to make it possible for people to plug allocators,
once defined, as the global allocator". These are two distinct endeavors.
At the level I'm at, I'm concerned with making good allocators easy to
implement. You may say you don't care, and that's good feedback, but it's
what I have for the time being.
Well, I'm discussing "how do people affect/override 'new' in various
circumstances?"

But sure, I said before, if we're limiting this discussion to the API of an
allocator then it looks fine, I see no obvious issues.
But I think the question requires consideration of the intended goal to
make informed decisions about the design, even at this level.
Jacob Carlborg
2013-09-23 18:52:51 UTC
Permalink
Post by Manu
I like operator new. It goes blue, and the statement isn't riddled with
exclamation marks.
For Ruby, many editors and IDE's highlight standard built in functions
as keywords. Like "new", "require", "attr_accessor" (and friends) to
mention a few.
Post by Manu
Imagine if new was a template (it would have to be a template).
Allocating a template would require nested template syntax every time,
which is pretty ugly.
X x = new!(X!arg(args...)); // ewoo, paren spam...
Why not:

X x = new!X(args);
--
/Jacob Carlborg
Dmitry Olshansky
2013-09-23 20:04:07 UTC
Permalink
Post by Jacob Carlborg
Post by Manu
Imagine if new was a template (it would have to be a template).
Allocating a template would require nested template syntax every time,
which is pretty ugly.
X x = new!(X!arg(args...)); // ewoo, paren spam...
X x = new!X(args);
+1 Or call it make for the time being so as to not collide with the keyword.
--
Dmitry Olshansky
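A minimal sketch of what such a `make` could look like, assuming only the allocate(size_t) -> ubyte[] primitive from the proposed API. Both `Mallocator` and `make` below are hypothetical illustrations, not part of std.allocator:

```d
import std.conv : emplace;
import core.stdc.stdlib : malloc;

// Hypothetical bottom-level allocator satisfying the proposed
// untyped API (allocate(size_t) -> ubyte[]).
struct Mallocator
{
    enum alignment = real.alignof;
    ubyte[] allocate(size_t n)
    {
        auto p = cast(ubyte*) malloc(n);
        return p is null ? null : p[0 .. n];
    }
}

// "make": a library-level stand-in for the `new` keyword that routes
// the allocation through any conforming allocator, then constructs
// the object in place.
T make(T, Allocator, Args...)(ref Allocator a, Args args)
    if (is(T == class))
{
    auto chunk = a.allocate(__traits(classInstanceSize, T));
    return chunk is null ? null : emplace!T(cast(void[]) chunk, args);
}

class Point
{
    int x, y;
    this(int x, int y) { this.x = x; this.y = y; }
}

void main()
{
    Mallocator m;
    auto p = make!Point(m, 3, 4);   // reads much like `new Point(3, 4)`
    assert(p.x == 3 && p.y == 4);
}
```

Note there is no nested template spam: only the type is a template argument, and the constructor arguments stay as ordinary runtime arguments.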
Johannes Pfau
2013-09-23 16:25:44 UTC
Permalink
Am Tue, 24 Sep 2013 00:50:02 +1000
[...] It also screws with generic code; X should be allocated with
'new', but Y should be allocated with yAllocator.alloc()?
What if you decide that Z, which allocates with 'new', becomes a
problem and you want to switch it into a pool? You now need to track
down every instance of 'new Z', and change it.
That's not a problem with the proposed GC allocator. We should add
proper overloads to emplace and a "create" function and then all
code should use create!(Allocator, MyClass)(args) or
create!MyClass(allocatorInstance, args).

New should then be considered as deprecated and replaced by
create!(GCAllocator, Class)().
Andrej Mitrovic
2013-09-23 16:36:57 UTC
Permalink
Post by Johannes Pfau
New should then be considered as deprecated and replaced by
create!(GCAllocator, Class)().
What? That's never gonna happen. For one thing, it looks ugly as hell.
And for another, it's going to break everything written in D.
Johannes Pfau
2013-09-23 16:50:34 UTC
Permalink
Am Mon, 23 Sep 2013 18:36:57 +0200
Post by Andrej Mitrovic
Post by Johannes Pfau
New should then be considered as deprecated and replaced by
create!(GCAllocator, Class)().
What? That's never gonna happen. For one thing, it looks ugly as hell.
And for another, it's going to break everything written in D.
That's why I say "considered" and not really deprecated. But if you
want to / have to write allocator-aware code which can use the GC, it's a
nice solution:

auto list(Payload, Allocator = GCAllocator)()
{
create!(Allocator) ...
}

Of course the API should be improved.
For example create could default to the GC allocator. Then it's
"auto a = new Class(...)" vs "auto a = create!Class(...)".

But IIRC when emplace was first implemented and class allocators were
removed it was quite clear that new could be easily replaced by a
template function and has no real place as a builtin anymore. It's just
too late to remove it.
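That default-to-GC pattern can be sketched as follows; `GCAllocator` and `grabBuffer` are hypothetical names for illustration only:

```d
// Hypothetical default allocator backed by the GC heap.
struct GCAllocator
{
    enum alignment = real.alignof;
    ubyte[] allocate(size_t n) { return new ubyte[n]; }
}

// Allocator-parameterized code: callers who don't care get the GC by
// default; callers who do pass their own conforming allocator type.
ubyte[] grabBuffer(Allocator = GCAllocator)(size_t n)
{
    Allocator a;
    return a.allocate(n);
}

void main()
{
    auto b = grabBuffer(16);   // allocator never mentioned; GC used
    assert(b.length == 16);
}
```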
Walter Bright
2013-09-23 18:24:12 UTC
Permalink
Post by Manu
Sure, true. But little things like separate it into its own lib, so you can
easily link a different one, or not link one at all.
To link in a different one, put the different one on the link command before
libphobos2.a. The different one will override the one in the library.
Post by Manu
Also, what about language support?
How it's done is deliberately not part of the language.
Post by Manu
How much language support is there currently?
Zero.
Post by Manu
I don't think I could currently write the ref-counting GC that I'd like without
extensive language support.
inc/dec ref calls would need to be inserted all over the place by the compiler,
and the optimiser would need to know how to remove redundant inc/dec sequences.
A ref counting system is something entirely different than just plugging in
another GC.
Andrei Alexandrescu
2013-09-23 19:09:48 UTC
Permalink
Post by Walter Bright
Post by Manu
Sure, true. But little things like separate it into its own lib, so you can
easily link a different one, or not link one at all.
To link in a different one, put the different one on the link command
before libphobos2.a. The different one will override the one in the
library.
Post by Manu
Also, what about language support?
How it's done is deliberately not part of the language.
Post by Manu
How much language support is there currently?
Zero.
I think there's at least indirect support in e.g. generating typeinfo
objects appropriately. Many of the components of those typeinfos are
expressly defined for helping a tracing collector.

Andrei
Walter Bright
2013-09-23 19:50:10 UTC
Permalink
I think there's at least indirect support in e.g. generating typeinfo objects
appropriately. Many of the components of those typeinfos are expressly defined
for helping a tracing collector.
Those are not language features.

Also, the latest support for that is done with a library-defined template.
QAston
2013-09-23 07:25:55 UTC
Permalink
Post by Manu
You've just described C++ verbatim.
And you've previously agreed that C++'s approach totally
failed. Can you
explain how this is different from C++, and how it solves the
issues that
defeated that design?
One possibility I'm keeping in mind is that of defining a
dynamic interface
Once you have a static allocator you can go dynamic easily:

interface DynamicAllocator // isAllocator==true
{
// whatever the interface is
}

struct StaticAllocator // isAllocator==true
{
}

struct Container(ALLOCATOR = DynamicAllocator) if
(isAllocator!ALLOCATOR)
{
this(ALLOCATOR a){}

}

class DynamicAllocatorWrapper(T) if (is(T == struct)) : DynamicAllocator
{
// wraps any static allocator to be callable using dynamic
interface
}

With that boilerplate you can easily create allocator agnostic
containers:

auto c = Container!DynamicAllocator(new
DynamicAllocatorWrapper!(StaticAllocator)());

and you can create specialized version if you need speed very
much:

auto c = Container!(StaticAllocator)(StaticAllocator());
Andrei Alexandrescu
2013-09-23 14:03:12 UTC
Permalink
Post by Manu
On 23 September 2013 13:01, Andrei Alexandrescu
<SeeWebsiteForEmail at erdani.org <mailto:SeeWebsiteForEmail at erdani.org>>
Containers and other objects that want to allow customization of
their allocation would be parameterized during compilation with an
allocator type. Functions that need to allocate memory may similarly
accept a parameter of allocator type.
You've just described C++ verbatim.
And you've previously agreed that C++'s approach totally failed. Can you
explain how this is different from C++, and how it solves the issues
that defeated that design?
As I just wrote in my previous post, the two mistakes in C++'s allocator
design are:

1. Allocators are parameterized by type.

2. Allocators cannot have state.

I fix both.
Post by Manu
One possibility I'm keeping in mind is that of defining a dynamic
interface (i.e. in the OOP sense) for a global allocator. Then
people can globally change what allocator must be used for operator
new invocations.
Right, this is getting warmer. It's a stark contrast to what you suggest
above though, and when I try and think this through, it gets very
complex, very fast.
I can't imagine how such a system could ever be declared safe. However,
this is more or less what I want. I don't know how to reconcile the 2
requirements.
There are possibilities, which I hope to get to soon.
Post by Manu
How much have you thought on this? This is where I think some serious
creativity will need to be applied...
Following this train of thought, I can imagine a really nice end goal
would be that the existing GC is effectively plugged in as a library,
and people can easily substitute it for their own GC if they want/need to.
Sean has already done that. The current GC is entirely replaceable (and
in particular extricable).


Andrei
Manu
2013-09-23 04:03:08 UTC
Permalink
On 23 September 2013 12:28, Andrei Alexandrescu <
Post by Andrei Alexandrescu
Post by Manu
Well it looks like a good start, but the thing I'm left wondering after
reading this is still... how is it actually used?
I think the greatest challenge is finding a simple, clean
Oxford comma please :o)
and correct
Post by Manu
way to actually specify which allocator should be used for making
allocations throughout your code, and perhaps more troublesome; within
generic code, and then layers of generic code.
My design makes it very easy to experiment by allowing one to define
complex allocators out of a few simple building blocks. It is not a
general-purpose allocator, but it allows one to define any number of such.
Oh okay, so this isn't really intended as a system then, so much as a
suggested API?
That makes almost all my questions redundant. I'm interested in the system,
not the API of a single allocator (although your API looks fine to me).
I already have allocators I use in my own code. Naturally, they don't
inter-operate with anything, and that's what I thought std.allocator was
meant to address.

Post by Andrei Alexandrescu
Post by Manu
Are you intending to be able to associate a particular allocator with a
class declaration?
No, or at least not at this level.
What about a struct declaration?
Same answer.
Post by Manu
What about a region of code (ie, a call tree/branch)?
What if the given allocator should be used for most of the tree, except
for a couple of things beneath that always want to use their explicit
allocator?
The proposed design makes it easy to create allocator objects. How they
are used and combined is left to the application.
Is that the intended limit of std.allocator's responsibility, or will
patterns come later?

Leaving the usage up to the application means we've gained nothing.
I already have more than enough allocators which I use throughout my code.
The problem is that they don't inter-operate, and certainly not with
foreign code/libraries.
This is what I hoped std.allocator would address.

Post by Andrei Alexandrescu
Post by Manu
What if I want to associate an allocator instance, not just an allocator
type (ie, I don't want multiple instances of the same type(/s) of
allocators in my code); how are they shared?
An allocator instance is a variable like any other. So you use the classic
techniques (shared globals, thread-local globals, passing around as
parameter) for using the same allocator object from multiple places.
Okay, that's fine... but this sort of manual management implies that I'm
using it explicitly. That's where it all falls down for me.

Eg, I want to use a library; its allocation patterns are incompatible with
my application; I need to provide it with an allocator.
What now? Is every library responsible for presenting the user with a
mechanism for providing allocators? What if the author forgets? (a problem
I've frequently had to chase up in the past when dealing with 3rd party
libraries)

Once a library is designed to expect a user to supply an allocator, what
happens if the user doesn't? Fall-back logic/boilerplate exists in every
library I guess...
And does that mean that applications+libraries are required to ALWAYS
allocate through given allocator objects?
That effectively makes the new keyword redundant. And what about the GC?

I can't really consider std.allocator until it presents some usage patterns.

Post by Andrei Alexandrescu
Post by Manu
It wasn't clear to me from your demonstration, but 'collect()' implies
that GC becomes allocator-aware; how does that work?
No, each allocator has its own means of dealing with memory. One could
define a tracing allocator independent of the global GC.
I'm not sure what this means. Other than I gather that the GC and
allocators are fundamentally separate?
Is it possible to create a tracing allocator without language support? Does
the current language insert any runtime calls to support the GC?

I want a ref-counting GC for instance to replace the existing GC, but it's
impossible to implement one of them nicely without support from the
language, to insert implicit inc/dec ref calls all over the place, and to
optimise away redundant inc/dec sequences.

Post by Andrei Alexandrescu
Post by Manu
deallocateAll() and collect() may each free a whole lot of memory, but
it seems to me that they may not actually be aware of the individual
allocations they are freeing; how do the appropriate destructors for the
sub-allocations get called?
No destructors are called at this level. Again, all these allocators
understand is ubyte[].
Fair enough. That's a problem for another time then.

Post by Andrei Alexandrescu
Post by Manu
I have a suspicion you're going to answer most of these questions with
the concept of allocator layering, but I just don't completely see it.
No, it's just that at this level some of these questions don't even have
an appropriate answer - like we discuss atoms and molecules and you ask
about building floors and beams and pillars.
Yep, fair enough.
I just saw "I am making good progress on the design of std.allocator" and
presumed you had some thoughts about actual usage semantics of the system.
I can easily define an allocator to use in my own code if it's entirely up
to me how I use it, but that completely defeats the purpose of this
exercise.
Until there are standard usage patterns, practices, and conventions that ALL
code follows, we have nothing. I was hoping to hear your thoughts
about those details.

Post by Andrei Alexandrescu
Post by Manu
It's quite an additional burden of resources and management to manage
the individual allocations with a range allocator above what is supposed
to be a performance critical allocator to begin with.
I don't understand this.
It's irrelevant here.
But fwiw, in relation to the prior point about block-freeing a range
allocation; there will be many *typed* allocations within these ranges, but
a typical range allocator doesn't keep track of the allocations within.
This seems like a common problem that may or may not want to be addressed
in std.allocator.
If the answer is simply "your range allocator should keep track of the
offsets of allocations, and their types", then fine. But that seems like
boilerplate that could be automated, or maybe there is a different/separate
system for such tracking?

Post by Andrei Alexandrescu
Post by Manu
C++'s design seems reasonable in some ways, but history has demonstrated
that it's a total failure, which is almost never actually used (I've
certainly never seen anyone use it).
Agreed. I've seen some uses of it that fall squarely within the notion of the
proverbial exception that proves the rule.
I think the main fail of C++'s design is that it mangles the type.
I don't think a type should be defined by the way its memory is allocated,
especially since that could change from application to application, or even
call to call. For my money, that's the fundamental flaw in C++'s design.
Post by Andrei Alexandrescu
Post by Manu
* Like a region allocator I guess, but the circular nature adds some
minor details, and requires the user to mark the heap from time to time,
freeing all allocations between the old mark and the new mark.
* Same as a free-list, but with a maximum size, ie, finite pool of
objects pre-allocated and pre-populating the freelist.
I implemented the finite size for a freelist.
Post by Manu
* Allocate from a group of pools allocating differently sized
objects. (this is a good test for allocator layering, supporting a group
of pools above, and fallback to the malloc heap for large objects)
I implemented that as well, it's one of the best designs I've defined in
my life.
Well as an atom, as you say, it seems like a good first step.
I can't see any obvious issues, although I don't think I quite understand
the collect() function if it has no relation to the GC. What is its
purpose?
If the idea is that you might implement some sort of tracking heap which is
able to perform a collect, how is that actually practical without language
support?

I had imagined going into this that, like the range interface which the
_language_ understands and interacts with, the allocator interface would be
the same, ie, the language would understand this API and integrate it with
'new', and the GC... somehow.
If allocators are just an object like in C++ that people may or may not
use, I don't think it'll succeed as a system. I reckon it needs deep
language integration to be truly useful.

The key problem to solve is the friction between different libraries, and
different moments within a single application itself.
I feel almost like the 'current' allocator needs to be managed as some sort
of state-machine. Passing them manually down the callstack is no good. And
'hard' binding objects to their allocators like C++ is no good either.
Andrei Alexandrescu
2013-09-23 13:58:42 UTC
Permalink
Post by Manu
On 23 September 2013 12:28, Andrei Alexandrescu
<SeeWebsiteForEmail at erdani.org <mailto:SeeWebsiteForEmail at erdani.org>>
My design makes it very easy to experiment by allowing one to define
complex allocators out of a few simple building blocks. It is not a
general-purpose allocator, but it allows one to define any number of such.
Oh okay, so this isn't really intended as a system then, so much as a
suggested API?
For some definition of "system" and "API", yes :o).
Post by Manu
That makes almost all my questions redundant. I'm interested in the
system, not the API of a single allocator (although your API looks fine
to me).
I already have allocators I use in my own code. Naturally, they don't
inter-operate with anything, and that's what I thought std.allocator was
meant to address.
Great. Do you have a couple of nontrivial allocators (heap, buddy system
etc) that could be adapted to the described API?
Post by Manu
The proposed design makes it easy to create allocator objects. How
they are used and combined is left to the application.
Is that the intended limit of std.allocator's responsibility, or will
patterns come later?
Some higher level design will come later. I'm not sure whether or not
you'll find it satisfying, for reasons I'll expand on below.
Post by Manu
Leaving the usage up to the application means we've gained nothing.
I already have more than enough allocators which I use throughout my
code. The problem is that they don't inter-operate, and certainly not
with foreign code/libraries.
This is what I hoped std.allocator would address.
Again, if you already have many allocators, please let me know if you
can share some.

std.allocator will prescribe a standard for defining allocators, with
which the rest of std will work, same as std.range prescribes a standard
for defining ranges, with which std.algorithm, std.format, and other
modules work. Clearly one could come back with "but I already have my
own ranges that use first/done/next instead of front/empty/popFront, so
I'm not sure what we're gaining here".
Post by Manu
An allocator instance is a variable like any other. So you use the
classic techniques (shared globals, thread-local globals, passing
around as parameter) for using the same allocator object from
multiple places.
Okay, that's fine... but this sort of manual management implies that I'm
using it explicitly. That's where it all falls down for me.
I think a disconnect here is that you think "it" where I think "them".
It's natural for an application to use one allocator that's not provided
by the standard library, and it's often the case that an application
defines and uses _several_ allocators for different parts of it. Then
the natural question arises, how to deal with these allocators, pass
them around, etc. etc.
Post by Manu
Eg, I want to use a library; its allocation patterns are incompatible
with my application; I need to provide it with an allocator.
What now? Is every library responsible for presenting the user with a
mechanism for providing allocators? What if the author forgets? (a
problem I've frequently had to chase up in the past when dealing with
3rd party libraries)
If the author forgets and hardcodes a library to use malloc(), I have no
way around that.
Post by Manu
Once a library is designed to expect a user to supply an allocator, what
happens if the user doesn't? Fall-back logic/boilerplate exists in every
library I guess...
The library wouldn't need to worry as there would be the notion of a
default allocator (probably backed by the existing GC).
Post by Manu
And does that mean that applications+libraries are required to ALWAYS
allocate through given allocator objects?
Yes, they should.
Post by Manu
That effectively makes the new keyword redundant.
new will still be used to tap into the global shared GC. std.allocator
will provide other means of allocating memory.
Post by Manu
And what about the GC?
The current global GC is unaffected for the time being.
Post by Manu
I can't really consider std.allocator until it presents some usage patterns.
Then you'd need to wait a little bit.
Post by Manu
It wasn't clear to me from your demonstration, but 'collect()' implies
that GC becomes allocator-aware; how does that work?
No, each allocator has its own means of dealing with memory. One
could define a tracing allocator independent of the global GC.
I'm not sure what this means. Other than I gather that the GC and
allocators are fundamentally separate?
Yes, they'd be distinct. Imagine an allocator that requests 4 MB from
the GC as NO_SCAN memory, and then does its own management inside that
block. User-level code allocates and frees e.g. strings or whatever from
that block, without the global GC intervening.
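That scheme can be sketched against today's core.memory API. Everything below (the `GCBackedRegion` name, the bump-pointer layout) is an illustrative assumption, not part of the proposal:

```d
import core.memory : GC;

// Illustrative sketch: a region that carves allocations out of one
// GC-provided NO_SCAN block, so per-allocation traffic never
// involves the global collector.
struct GCBackedRegion
{
    enum alignment = real.alignof;
    private ubyte[] store;
    private size_t used;

    this(size_t capacity)
    {
        auto p = cast(ubyte*) GC.malloc(capacity, GC.BlkAttr.NO_SCAN);
        store = p[0 .. capacity];
    }

    ubyte[] allocate(size_t n)
    {
        // Round the bump pointer up to the declared alignment.
        immutable rounded = (n + alignment - 1) & ~(alignment - 1);
        if (used + rounded > store.length) return null;  // out of room
        auto b = store[used .. used + n];
        used += rounded;
        return b;
    }

    // Everything dies at once; the big block itself stays with the GC.
    void deallocateAll() { used = 0; }
}

void main()
{
    auto strings = GCBackedRegion(4 * 1024 * 1024);
    auto s = strings.allocate(100);
    assert(s.length == 100);
    strings.deallocateAll();
}
```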
Post by Manu
Is it possible to create a tracing allocator without language support?
I think it is possible.
Post by Manu
Does the current language insert any runtime calls to support the GC?
Aside from operator new, I don't think so.
Post by Manu
I want a ref-counting GC for instance to replace the existing GC, but
it's impossible to implement one of them nicely without support from the
language, to insert implicit inc/dec ref calls all over the place, and
to optimise away redundant inc/dec sequences.
Unfortunately that's a chimera I had to abandon, at least at this level.
The problem is that installing an allocator does not get to define what
a pointer is and what a reference is. These are notions hardwired into
the language, so the notion of turning a switch and replacing the global
GC with a reference counting scheme is impossible at the level of a
library API.

(As an aside, you still need tracing for collecting cycles in a
transparent reference counting scheme, so it's not all roses.)

What I do hope to get to is to have allocators define their own pointers
and reference types. User code that uses those will be guaranteed
certain allocation behaviors.
Post by Manu
I can easily define an allocator to use in my own code if it's entirely
up to me how I use it, but that completely defeats the purpose of this
exercise.
It doesn't. As long as the standard prescribes ONE specific API for
defining untyped allocators, if you define your own to satisfy that API,
then you'll be able to use your allocator with e.g. std.container, just
the same as defining your own range the way std.range requires allows
you to tap into std.algorithm.
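The range analogy suggests what such a contract could look like in code: a compile-time predicate in the spirit of std.range.isInputRange. The predicate below is a hypothetical sketch checking only two of the primitives; the real std.allocator would presumably check more (expand, reallocate, deallocate):

```d
// Hypothetical conformance check, analogous to std.range.isInputRange:
// a type is an (untyped) allocator if it exposes an alignment constant
// and an allocate(size_t) primitive returning ubyte[].
enum bool isAllocator(A) =
    is(typeof(A.alignment) : ulong) &&
    is(typeof(A.init.allocate(size_t.init)) == ubyte[]);

struct GoodAllocator
{
    enum alignment = real.alignof;
    ubyte[] allocate(size_t n) { return new ubyte[n]; }
}

struct NotAnAllocator {}

static assert( isAllocator!GoodAllocator);
static assert(!isAllocator!NotAnAllocator);

// Containers then constrain on the predicate, exactly as
// std.algorithm constrains on isInputRange:
struct Container(Allocator) if (isAllocator!Allocator)
{
    Allocator alloc;
}

void main()
{
    Container!GoodAllocator c;  // compiles; Container!NotAnAllocator won't
}
```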
Post by Manu
Until there aren't standard usage patterns, practises, conventions that
ALL code follows, then we have nothing. I was hoping to hear your
thoughts about those details.
It's quite an additional burden of resources and management to manage
the individual allocations with a range allocator above what is supposed
to be a performance critical allocator to begin with.
I don't understand this.
It's irrelevant here.
But fwiw, in relation to the prior point about block-freeing a range
allocation;
What is a "range allocation"?
Post by Manu
there will be many *typed* allocations within these ranges,
but a typical range allocator doesn't keep track of the allocations within.
Do you mean s/range/region/?
Post by Manu
This seems like a common problem that may or may not want to be
addressed in std.allocator.
If the answer is simply "your range allocator should keep track of the
offsets of allocations, and their types", then fine. But that seems like
boilerplate that could be automated, or maybe there is a
different/separate system for such tracking?
If you meant region, then yes that's boilerplate that hopefully will be
reasonably automated by std.allocator. (What I discussed so far predates
that stage of the design.)
Post by Manu
C++'s design seems reasonable in some ways, but history has demonstrated
that it's a total failure, which is almost never actually used (I've
certainly never seen anyone use it).
Agreed. I've seen some uses of it that fall squarely within the notion
of the proverbial exception that proves the rule.
I think the main fail of C++'s design is that it mangles the type.
I don't think a type should be defined by the way its memory is
allocated, especially since that could change from application to
application, or even call to call. For my money, that's the fundamental
flaw in C++'s design.
This is not a flaw as much as an engineering choice with advantages and
disadvantages on the relative merits of which reasonable people may
disagree.

There are two _fundamental_ flaws of the C++ allocator design, in the
sense that they are very difficult to argue in favor of and relatively
easy to argue against:

1. Allocators are parameterized by type; instead, individual allocations
should be parameterized by type.

2. There is no appropriate handling for allocators with state.

The proposed std.allocator design deals with (2) with care, and will
deal with (1) when it gets to typed allocators.
Post by Manu
Well as an atom, as you say, it seems like a good first step.
I can't see any obvious issues, although I don't think I quite
understand the collect() function if it has no relation to the GC. What
is its purpose?
At this point collect() is only implemented by the global GC. It is
possible I'll drop it from the final design. However, it's also possible
that collect() will be properly defined as "collect all objects
allocated within this particular allocator that are not referred to from
any objects also allocated within this allocator". I think that's a
useful definition.
Post by Manu
If the idea is that you might implement some sort of tracking heap which
is able to perform a collect, how is that actually practical without
language support?
Language support would be needed for things like scanning the stack and
the globals. But one can gainfully use a heap with semantics as
described just above, which requires no language support.
Post by Manu
I had imagined going into this that, like the range interface which the
_language_ understands and interacts with, the allocator interface would
be the same, ie, the language would understand this API and integrate it
with 'new', and the GC... somehow.
The D language has no idea what a range is. The notion is completely
defined in std.range.
Post by Manu
If allocators are just an object like in C++ that people may or may not
use, I don't think it'll succeed as a system. I reckon it needs deep
language integration to be truly useful.
I guess that's to be seen.
Post by Manu
The key problem to solve is the friction between different libraries,
and different moments within a single application itself.
I feel almost like the 'current' allocator needs to be managed as some
sort of state-machine. Passing them manually down the callstack is no
good. And 'hard' binding objects to their allocators like C++ is no good
either.
I think it's understood that if a library chooses its own ways to
allocate memory, there's no way around that. The point of std.allocator
is that it defines a common interface that user code can work with.


Andrei
Simen Kjaeraas
2013-09-23 14:16:50 UTC
Permalink
Post by Andrei Alexandrescu
Post by Manu
I had imagined going into this that, like the range interface which the
_language_ understands and interacts with, the allocator interface would
be the same, ie, the language would understand this API and integrate it
with 'new', and the GC... somehow.
The D language has no idea what a range is. The notion is completely
defined in std.range.
The language has some knowledge of ranges, or foreach (e; myRange) would
not work.
--
Simen
Andrei Alexandrescu
2013-09-23 14:49:20 UTC
Permalink
Post by Simen Kjaeraas
Post by Manu
I had imagined going into this that, like the range interface which the
_language_ understands and interacts with, the allocator interface would
be the same, ie, the language would understand this API and integrate it
with 'new', and the GC... somehow.
The D language has no idea what a range is. The notion is completely
defined in std.range.
The language has some knowledge of ranges, or foreach (e; myRange) would
not work.
Great point. I amend my claim.

Andrei
Manu
2013-09-23 14:53:21 UTC
Permalink
On 24 September 2013 00:49, Andrei Alexandrescu <
Post by Andrei Alexandrescu
Post by Simen Kjaeraas
Post by Manu
I had imagined going into this that, like the range interface which the
_language_ understands and interacts with, the allocator interface would
be the same, ie, the language would understand this API and integrate it
with 'new', and the GC... somehow.
The D language has no idea what a range is. The notion is completely
defined in std.range.
The language has some knowledge of ranges, or foreach (e; myRange) would
not work.
Great point. I amend my claim.
Mmm, precisely what I was talking about.
Manu
2013-09-23 15:32:26 UTC
Permalink
On 23 September 2013 23:58, Andrei Alexandrescu <
Post by Andrei Alexandrescu
Post by Manu
On 23 September 2013 12:28, Andrei Alexandrescu
My design makes it very easy to experiment by allowing one to define
complex allocators out of a few simple building blocks. It is not a
general-purpose allocator, but it allows one to define any number of such.
Oh okay, so this isn't really intended as a system then, so much as a
suggested API?
For some definition of "system" and "API", yes :o).
Post by Manu
That makes almost all my questions redundant. I'm interested in the
system, not the API of a single allocator (although your API looks fine
to me).
I already have allocators I use in my own code. Naturally, they don't
inter-operate with anything, and that's what I thought std.allocator was
meant to address.
Great. Do you have a couple of nontrivial allocators (heap, buddy system
etc) that could be adapted to the described API?
Err, not really actually. When I use custom allocators, it's for
performance, which basically implies that it IS a trivial allocator :)
The common ones I use are: stack-based mark&release, circular buffers,
pools, pool groups (collection of different sized pools)... that might be
it actually. Very simple tools for different purposes.
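The first of those is small enough to sketch whole. This is a hypothetical illustration of the pattern under the proposed API, not Manu's actual implementation:

```d
// Hypothetical stack-based mark & release allocator: allocation is a
// pointer bump; release() frees everything allocated since mark().
struct MarkRelease(size_t capacity)
{
    enum alignment = real.alignof;
    private ubyte[capacity] store;
    private size_t top;

    ubyte[] allocate(size_t n)
    {
        // Keep the bump pointer aligned.
        immutable rounded = (n + alignment - 1) & ~(alignment - 1);
        if (top + rounded > capacity) return null;
        auto b = store[top .. top + n];
        top += rounded;
        return b;
    }

    size_t mark() { return top; }        // remember the high-water point
    void release(size_t m) { top = m; }  // drop everything above it
}

void main()
{
    MarkRelease!1024 a;
    auto m = a.mark();
    auto b = a.allocate(100);
    assert(b.length == 100);
    a.release(m);                        // b's storage is reusable now
    assert(a.allocate(1024) !is null);   // full capacity available again
}
```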

Post by Andrei Alexandrescu
Post by Manu
The proposed design makes it easy to create allocator objects. How
they are used and combined is left to the application.
Is that the intended limit of std.allocator's responsibility, or will
patterns come later?
Some higher level design will come later. I'm not sure whether or not
you'll find it satisfying, for reasons I'll expand on below.
Post by Manu
Leaving the usage up to the application means we've gained nothing.
I already have more than enough allocators which I use throughout my
code. The problem is that they don't inter-operate, and certainly not
with foreign code/libraries.
This is what I hoped std.allocator would address.
Again, if you already have many allocators, please let me know if you can
share some.
std.allocator will prescribe a standard for defining allocators, with
which the rest of std will work, same as std.range prescribes a standard
for defining ranges, with which std.algorithm, std.format, and other
modules work. Clearly one could come back with "but I already have my own
ranges that use first/done/next instead of front/empty/popFront, so I'm not
sure what we're gaining here".
No, it's just that I'm saying std.allocator needs to do a lot more than
define a contract before I can start to consider if it solves my problems.
This is a good first step though, I'm happy to discuss this, but I think
discussion about the practical application may also reveal design details
at this level.

It's like you say, I can rename my allocator's methods to suit an agreed
standard, that'll take me 2 minutes, but it's how the rest of the universe
interacts with that API that matters, and if it effectively solves my
problems.

An allocator instance is a variable like any other. So you use the
Post by Andrei Alexandrescu
Post by Manu
classic techniques (shared globals, thread-local globals, passing
around as parameter) for using the same allocator object from
multiple places.
Okay, that's fine... but this sort of manual management implies that I'm
using it explicitly. That's where it all falls down for me.
I think a disconnect here is that you think "it" where I think "them".
It's natural for an application to use one allocator that's not provided by
the standard library, and it's often the case that an application defines
and uses _several_ allocators for different parts of it. Then the natural
question arises, how to deal with these allocators, pass them around, etc.
etc.
No, I certainly understand you mean 'them', but you lead to what I'm
asking, how do these things get carried/passed around. Are they discrete,
or will they invade argument lists everywhere? Are they free to flow in/out
of libraries in a natural way?
These patterns are what will define the system as I see it.
Perhaps more importantly, where do these allocators get their memory
themselves (if they're not a bottom-level allocator)? Global override
perhaps, or should a memory source always be explicitly provided to a
non-bottom-level allocator?

Eg, I want to use a library, its allocation patterns are incompatible
Post by Andrei Alexandrescu
Post by Manu
with my application; I need to provide it with an allocator.
What now? Is every library responsible for presenting the user with a
mechanism for providing allocators? What if the author forgets? (a
problem I've frequently had to chase up in the past when dealing with
3rd party libraries)
If the author forgets and hardcodes a library to use malloc(), I have no
way around that.
Sure, but the common case is that the author will almost certainly use
keyword 'new'. How can I affect that as a 3rd party?
This would require me overriding the global allocator somehow... which you
touched on earlier.

Once a library is designed to expect a user to supply an allocator, what
Post by Andrei Alexandrescu
Post by Manu
happens if the user doesn't? Fall-back logic/boilerplate exists in every
library I guess...
The library wouldn't need to worry as there would be the notion of a
default allocator (probably backed by the existing GC).
Right. So it's looking like the ability to override the global
allocator is a critical requirement.

And does that mean that applications+libraries are required to ALWAYS
Post by Andrei Alexandrescu
Post by Manu
allocate through given allocator objects?
Yes, they should.
Then we make keyword 'new' redundant?

That effectively makes the new keyword redundant.
Post by Andrei Alexandrescu
new will still be used to tap into the global shared GC. std.allocator
will provide other means of allocating memory.
I think the system will fail here. People will use 'new', simply because
it's a keyword. Once that's boxed in a library, I will no longer be able to
affect that inconsiderate behaviour from my application.
Again, I think this signals that a global override is necessary.

And what about the GC?
Post by Andrei Alexandrescu
The current global GC is unaffected for the time being.
I can't really consider std.allocator until it presents some usage
Post by Manu
patterns.
Then you'd need to wait a little bit.
Okay.

It wasn't clear to me from your demonstration, but 'collect()'
Post by Andrei Alexandrescu
Post by Manu
implies
that GC becomes allocator-aware; how does that work?
No, each allocator has its own means of dealing with memory. One
could define a tracing allocator independent of the global GC.
I'm not sure what this means. Other than I gather that the GC and
allocators are fundamentally separate?
Yes, they'd be distinct. Imagine an allocator that requests 4 MB from the
GC as NO_SCAN memory, and then does its own management inside that block.
User-level code allocates and frees e.g. strings or whatever from that
block, without the global GC intervening.
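[Editor's note: the setup described here can be sketched with the existing runtime API; core.memory.GC and BlkAttr.NO_SCAN are real, while the sub-allocation policy is left abstract.]

```d
import core.memory : GC;

void example()
{
    // Carve 4 MB out of the GC as NO_SCAN memory: the GC will not scan it
    // for pointers, and a custom allocator parcels it out from here on.
    enum blockSize = 4 * 1024 * 1024;
    auto block = (cast(ubyte*) GC.malloc(blockSize, GC.BlkAttr.NO_SCAN))
                 [0 .. blockSize];
    // ... sub-allocate strings etc. by slicing `block`, without the
    // global GC intervening ...
}
```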
Yup, that's fine. But what if the GC isn't the bottom level? There's just
another allocator underneath.
What I'm saying is, the GC should *be* an allocator, not be a separate
entity.

I want to eliminate the GC from my application. Ideally, in the future, it
can be replaced with an ARC, which I have become convinced is the right
choice for my work.

Is it possible to create a tracing allocator without language support?
Post by Andrei Alexandrescu
I think it is possible.
Does the current language insert any runtime calls to support the GC?
Aside from operator new, I don't think so.
Okay, so a flexible lowering of 'new' is all we need for now?
It will certainly need substantially more language support for ARC.

I want a ref-counting GC for instance to replace the existing GC, but
Post by Andrei Alexandrescu
Post by Manu
it's impossible to implement one of them nicely without support from the
language, to insert implicit inc/dec ref calls all over the place, and
to optimise away redundant inc/dec sequences.
Unfortunately that's a chimera I had to abandon, at least at this level.
And there's the part you said I'm not going to like? ;)

The problem is that installing an allocator does not get to define what a
Post by Andrei Alexandrescu
pointer is and what a reference is.
Why not? A pointer has a type, like anything else. An ARC pointer can
theoretically have the compiler insert ARC magic.
That does imply though that the allocator affects the type, which I don't
like... I'll think on it.

These are notions hardwired into the language, so the notion of turning a
Post by Andrei Alexandrescu
switch and replacing the global GC with a reference counting scheme is
impossible at the level of a library API.
Indeed it is. So is this API being built upon an incomplete foundation? Is
there something missing, and can it be added later, or will this design
cement some details that might need changing in the future? (we all know
potentially breaking changes like that will never actually happen)

(As an aside, you still need tracing for collecting cycles in a transparent
Post by Andrei Alexandrescu
reference counting scheme, so it's not all roses.)
It's true, but it's possible to explicitly control all those factors. It
remains deterministic.

What I do hope to get to is to have allocators define their own pointers
Post by Andrei Alexandrescu
and reference types. User code that uses those will be guaranteed certain
allocation behaviors.
Interesting, will this mangle the pointer type, or the object type being
pointed to? The latter is obviously not desirable. Does the former actually
work in theory?

I can easily define an allocator to use in my own code if it's entirely
Post by Andrei Alexandrescu
Post by Manu
up to me how I use it, but that completely defeats the purpose of this
exercise.
It doesn't. As long as the standard prescribes ONE specific API for
defining untyped allocators, if you define your own to satisfy that API,
then you'll be able to use your allocator with e.g. std.container, just the
same as defining your own range as std.range requires allows you to tap
into std.algorithm.
I realise this. That's all fine.

Until there are standard usage patterns, practices, conventions that
Post by Andrei Alexandrescu
Post by Manu
ALL code follows, then we have nothing. I was hoping to hear your
thoughts about those details.
It's quite an additional burden of resources and management to
Post by Manu
manage
the individual allocations with a range allocator above what is supposed
to be a performance critical allocator to begin with.
I don't understand this.
It's irrelevant here.
But fwiw, in relation to the prior point about block-freeing a range
allocation;
What is a "range allocation"?
there will be many *typed* allocations within these ranges,
Post by Manu
but a typical range allocator doesn't keep track of the allocations within.
Do you mean s/range/region/?
Yes.

This seems like a common problem that may or may not want to be
Post by Andrei Alexandrescu
Post by Manu
addressed in std.allocator.
If the answer is simply "your range allocator should keep track of the
offsets of allocations, and their types", then fine. But that seems like
boilerplate that could be automated, or maybe there is a
different/separate system for such tracking?
If you meant region, then yes that's boilerplate that hopefully will be
reasonably automated by std.allocator. (What I discussed so far predates
that stage of the design.)
C++'s design seems reasonable in some ways, but history has
Post by Manu
demonstrated
that it's a total failure, which is almost never actually used (I've
certainly never seen anyone use it).
Agreed. I've seen some uses of it that quite fall within the notion
of the proverbial exception that proves the rule.
I think the main fail of C++'s design is that it mangles the type.
I don't think a type should be defined by the way its memory is
allocated, especially since that could change from application to
application, or even call to call. For my money, that's the fundamental
flaw in C++'s design.
This is not a flaw as much as an engineering choice with advantages and
disadvantages on the relative merits of which reasonable people may
disagree.
There are two _fundamental_ flaws of the C++ allocator design, in the
sense that they are very difficult to argue in favor of and relatively easy
to argue against:
1. Allocators are parameterized by type; instead, individual allocations
should be parameterized by type.
2. There is no appropriate handling for allocators with state.
The proposed std.allocator design deals with (2) with care, and will deal
with (1) when it gets to typed allocators.
Fair enough. These are certainly more critical mistakes than the one I
raised.
I'm trying to remember the details of the practical failures I ran into
trying to use C++ allocators years ago.
Eventually, experience proved to us (myself and colleagues) that it wasn't
worth the mess, and we simply pursued a more direct solution. I've heard
similar stories from friends in other companies...
I need to try and recall the specific scenarios though, they might be
interesting :/ .. (going back the better part of a decade >_<)

Well as an atom, as you say, it seems like a good first step.
Post by Andrei Alexandrescu
Post by Manu
I can't see any obvious issues, although I don't think I quite
understand the collect() function if it has no relation to the GC. What
is its purpose?
At this point collect() is only implemented by the global GC. It is
possible I'll drop it from the final design. However, it's also possible
that collect() will be properly defined as "collect all objects allocated
within this particular allocator that are not referred from any objects
also allocated within this allocator". I think that's a useful definition.
Perhaps. I'm not sure how this situation arises though. Unless you've
managed to implement your own GC inside an allocator.

If the idea is that you might implement some sort of tracking heap which
Post by Andrei Alexandrescu
Post by Manu
is able to perform a collect, how is that actually practical without
language support?
Language support would be needed for things like scanning the stack and
the globals. But one can gainfully use a heap with semantics as described
just above, which requires no language support.
I had imagined going into this that, like the range interface which the
Post by Manu
_language_ understands and interacts with, the allocator interface would
be the same, ie, the language would understand this API and integrate it
with 'new', and the GC... somehow.
The D language has no idea what a range is. The notion is completely
defined in std.range.
If allocators are just an object like in C++ that people may or may not
Post by Manu
use, I don't think it'll succeed as a system. I reckon it needs deep
language integration to be truly useful.
I guess that's to be seen.
I think a critical detail to keep in mind, is that (I suspect) people
simply won't use it if it doesn't interface with keyword 'new'.
It also complicates generic code, and makes it more difficult to retrofit
an allocator where 'new' is already in use.

The key problem to solve is the friction between different libraries,
Post by Andrei Alexandrescu
Post by Manu
and different moments within a single application itself.
I feel almost like the 'current' allocator needs to be managed as some
sort of state-machine. Passing them manually down the callstack is no
good. And 'hard' binding objects to their allocators like C++ is no good
either.
I think it's understood that if a library chooses its own ways to allocate
memory, there's no way around that.
Are we talking here about explicit choice for sourcing memory, or just that
the library allocates through the default/GC?

This is the case where I like to distinguish a bottom-level allocator from
a high-level allocator.
A library probably wants to use some patterns for allocation of its
objects; these are high-level allocators, but where it sources its memory
from still needs to be overridable.
It's extremely common that I want to enforce that a library exist entirely
within a designated heap. It can't fall back to the global GC.

I work on platforms where memory is not unified. Different resources need
to go into different heaps.
It has happened on numerous occasions that we have been denied a useful
library simply because the author did not provide allocation hooks, and the
author was not responsive to requests... leading to my favourite scenario
of re-inventing yet another wheel (the story of my career).
It shouldn't be the case that the author has to manually account for the
possibility that someone might want to provide a heap for the libraries
resources.

This is equally true for the filesystem (a gripe I haven't really raised in
D yet).

The point of std.allocator is that it defines a common interface that user
Post by Andrei Alexandrescu
code can work with.
Andrei
deadalnix
2013-09-23 17:36:07 UTC
Permalink
On Monday, 23 September 2013 at 13:58:42 UTC, Andrei Alexandrescu
Post by Andrei Alexandrescu
If the author forgets and hardcodes a library to use malloc(),
I have no way around that.
We should consider that the lib writer will use new and/or automatic
allocation, like for closures. In practice, this is what will
happen.
Post by Andrei Alexandrescu
new will still be used to tap into the global shared GC.
std.allocator will provide other means of allocating memory.
That utterly sucks, and we can do better using type information.
Post by Andrei Alexandrescu
At this point collect() is only implemented by the global GC.
It is possible I'll drop it from the final design. However,
it's also possible that collect() will be properly defined as
"collect all objects allocated within this particular allocator
that are not referred from any objects also allocated within
this allocator". I think that's a useful definition.
I don't think collect is useful with no type information (not
even whether the data may contain pointers).
Post by Andrei Alexandrescu
The D language has no idea what a range is. The notion is
completely defined in std.range.
foreach has some idea of what a range is.
Andrei Alexandrescu
2013-09-23 17:53:06 UTC
Permalink
Post by Manu
On 23 September 2013 23:58, Andrei Alexandrescu
<SeeWebsiteForEmail at erdani.org <mailto:SeeWebsiteForEmail at erdani.org>>
This is a good first step though, I'm happy to discuss this, but I think
discussion about the practical application may also reveal design
details at this level.
Absolutely. This is refreshing since I've gone through the cycle of
"types must be integrated into allocators and with the global GC" ... ->
... "untyped allocators can actually be extricated" once. It is now
great to revisit the assumptions again.

One matter I'm rather surprised didn't come up (I assumed everyone would
be quite curious about it) is that the allocator does not store the
allocated size, or at least does not allow it to be discovered easily.
This is a definitely appropriate topic for the design at hand.

The malloc-style APIs go like:

void* malloc(size_t size);
void free(void*);

Note that the user doesn't pass the size to free(), which means the
allocator is burdened with inferring the size of the block from the
pointer efficiently. Given that most allocators make crucial strategic
choices depending on the size requested, this API shortcoming is a bane
of allocator writers everywhere, and a source of inefficiency in
virtually all contemporary malloc-style allocators.

This is most ironic since there is next to nothing that user code can do
with a pointer without knowing the extent of memory addressed by it. (A
notable exception is zero-terminated strings, another questionable
design decision.) It all harkens back to Walter's claim of "C's biggest
mistake" (with which I agree) of not defining a type for a bounded
memory region.

Upon studying a few extant allocators and D's lore, I decided that in D
things have evolved sufficiently to have the user pass the size upon
deallocation as well:

void[] allocate(size_t size);
void deallocate(void[] buffer);

This is because the size of D objects is naturally known: classes have
it in the classinfo, slices store it, and the few cases of using bald
pointers for allocation are irrelevant and unrecommended.

This all makes memory allocation in D a problem fundamentally simpler
than in C, which is quite an interesting turn :o).
Post by Manu
No, I certainly understand you mean 'them', but you lead to what I'm
asking, how do these things get carried/passed around. Are they
discreet, or will they invade argument lists everywhere?
Since there's a notion of "the default" allocator, there will be ways to
push/pop user-defined allocators that temporarily (or permanently)
replace the default allocator. This stage of the design is concerned
with allowing users to define such user-defined allocators without
having to implement them from scratch.
Post by Manu
Are they free
to flow in/out of libraries in a natural way?
These patterns are what will define the system as I see it.
Perhaps more importantly, where do these allocators get their memory
themselves (if they're not a bottom-level allocator)? Global override
perhaps, or should a memory source always be explicitly provided to a
non-bottom-level allocator?
There are a few obvious bottom allocators, such as malloc, new, sbrk, or
Windows heap. All other allocators will compose by parameterizing on
their parent allocator.
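[Editor's note: the composition described here might look as follows. Names are hypothetical, and `owns` is an assumed extra primitive not part of the API shown earlier in the thread.]

```d
import core.stdc.stdlib : malloc, free;

// A plausible bottom allocator over C's malloc/free.
struct Mallocator
{
    void[] allocate(size_t s)
    {
        auto p = malloc(s);
        return p is null ? null : p[0 .. s];
    }
    void deallocate(void[] b) { free(b.ptr); }
}

// Composition by parameterizing on the parent allocators: serve from
// Primary, fall back to Fallback when Primary is exhausted.
struct FallbackAllocator(Primary, Fallback)
{
    Primary primary;
    Fallback fallback;

    void[] allocate(size_t s)
    {
        auto b = primary.allocate(s);
        return b !is null ? b : fallback.allocate(s);
    }

    void deallocate(void[] b)
    {
        // `owns` is assumed here so frees can be routed back correctly.
        if (primary.owns(b)) primary.deallocate(b);
        else fallback.deallocate(b);
    }
}
```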
Post by Manu
Sure, but the common case is that the author will almost certainly use
keyword 'new'. How can I affect that as a 3rd party?
This would require me overriding the global allocator somehow... which
you touched on earlier.
The way I'm thinking of this is to install/uninstall user-defined
allocators that will satisfy calls to new. Since not all allocators
support tracing, some would require the user to deallocate stuff manually.
Post by Manu
The library wouldn't need to worry as there would be the notion of a
default allocator (probably backed by the existing GC).
Right. So it's looking like the ability to override the global
allocator is a critical requirement.
Again, this is difficult to handle in the same conversation as "should
we pass size upon deallocation". Yes, we need road laws, but it's hard
to talk about those and engine lubrication in the same breath. For me,
"critical" right now is to assess whether the untyped API misses an
important category of allocators, what safety level it has (that's why
"expand" is so different from "realloc"!) etc.
Post by Manu
And does that mean that applications+libraries are required to ALWAYS
allocate through given allocator objects?
Yes, they should.
Then we make keyword 'new' redundant?
Probably not. Typed allocators will need to integrate with (and
occasionally replace) the global shared GC.
Post by Manu
The problem is that installing an allocator does not get to define
what a pointer is and what a reference is.
Why not?
Because that requires a language change. I'm not sure you realize but
you move the goalposts all the time. We were talking within the context
of libraries and installing allocators dynamically and all of a sudden
you get to change what the compiler thinks a pointer is.
Post by Manu
A pointer has a type, like anything else. An ARC pointer can
theoretically have the compiler insert ARC magic.
That does imply though that the allocator affects the type, which I
don't like... I'll think on it.
I talked Walter's ear off a while ago at an ACCU conference about the
notion that reference counting could be a switch passed to the compiler.
Recently he's authored a DIP about the compiler inserting refcounting
primitives in the generated code. Unfortunately I think the DIP is
awfully incomplete and superficial (to the extent it would bring more
damage to the language if any implementation effort would be invested in
it at this point).

Can it be done? Probably. But it would be an absolutely major effort.
Post by Manu
These are notions hardwired into the language, so the notion of
turning a switch and replacing the global GC with a reference
counting scheme is impossible at the level of a library API.
Indeed it is. So is this API being built upon an incomplete foundation?
The API is quite good at what it purports to do.
Post by Manu
What I do hope to get to is to have allocators define their own
pointers and reference types. User code that uses those will be
guaranteed certain allocation behaviors.
Interesting, will this mangle the pointer type, or the object type being
pointed to? The latter is obviously not desirable. Does the former
actually work in theory?
I don't think I understand what you mean. Honest, it seems to me you're
confused about what you want, how it can be done, and what moving pieces
are involved.

One example is Rust, which defines several pointer types to implement
its scoping and borrowing semantics. I think it's not easy to achieve
its purported semantics with fewer types, and time will tell whether
programmers will put up with the complexity for the sake of the
corresponding benefits.
Post by Manu
At this point collect() is only implemented by the global GC. It is
possible I'll drop it from the final design. However, it's also
possible that collect() will be properly defined as "collect all
objects allocated within this particular allocator that are not
referred from any objects also allocated within this allocator". I
think that's a useful definition.
Perhaps. I'm not sure how this situation arises though. Unless you've
managed to implement your own GC inside an allocator.
That should be doable within D today.
Post by Manu
I think a critical detail to keep in mind, is that (I suspect) people
simply won't use it if it doesn't interface with keyword 'new'.
I think this is debatable. For one, languages such as Java and C++ still
have built-in "new" but quite ubiquitously unrecommend their usage in
user code. Far as I can tell that's been a successful campaign.

For two, there are allocations that don't even entail calls to new, such
as array concatenation.
Post by Manu
It's extremely common that I want to enforce that a library exist
entirely within a designated heap. It can't fall back to the global GC.
I work on platforms where memory is not unified. Different resources
need to go into different heaps.
The proposed API would make that entirely possible. You seem to care
only about the appendix "and mind you I only want to use new for all of
those!" which is a bit out there. But nevertheless good feedback.


Andrei
Manu
2013-09-24 05:47:57 UTC
Permalink
On 24 September 2013 03:53, Andrei Alexandrescu <
Post by Manu
On 23 September 2013 23:58, Andrei Alexandrescu
This is a good first step though, I'm happy to discuss this, but I think
discussion about the practical application may also reveal design
details at this level.
Absolutely. This is refreshing since I've gone through the cycle of "types
must be integrated into allocators and with the global GC" ... -> ...
"untyped allocators can actually be extricated" once. It is now great to
revisit the assumptions again.
One matter I'm rather surprised didn't come up (I assumed everyone would
be quite curious about it) is that the allocator does not store the
allocated size, or at least does not allow it to be discovered easily. This
is a definitely appropriate topic for the design at hand.
void* malloc(size_t size);
void free(void*);
Note that the user doesn't pass the size to free(), which means the
allocator is burdened with inferring the size of the block from the pointer
efficiently. Given that most allocators make crucial strategic choices
depending on the size requested, this API shortcoming is a bane of
allocator writers everywhere, and a source of inefficiency in virtually all
contemporary malloc-style allocators.
This is most ironic since there is next to nothing that user code can do
with a pointer without knowing the extent of memory addressed by it. (A
notable exception is zero-terminated strings, another questionable design
decision.) It all harkens back to Walter's claim of "C's biggest mistake"
(with which I agree) of not defining a type for a bounded memory region.
Upon studying a few extant allocators and D's lore, I decided that in D
things have evolved sufficiently to have the user pass the size upon
void[] allocate(size_t size);
void deallocate(void[] buffer);
This is because the size of D objects is naturally known: classes have it
in the classinfo, slices store it, and the few cases of using bald pointers
for allocation are irrelevant and unrecommended.
This all makes memory allocation in D a problem fundamentally simpler than
in C, which is quite an interesting turn :o).
Yeah, that's definitely cool.
I had previously just presumed the allocator would stash the size somewhere
along with its allocation record (as usual), but this does potentially
simplify some allocators.

No, I certainly understand you mean 'them', but you lead to what I'm
Post by Manu
asking, how do these things get carried/passed around. Are they
discrete, or will they invade argument lists everywhere?
Since there's a notion of "the default" allocator, there will be ways to
push/pop user-defined allocators that temporarily (or permanently) replace
the default allocator. This stage of the design is concerned with allowing
users to define such user-defined allocators without having to implement
them from scratch.
I get that about this stage of design. But my question was about the next
stage, that is, usage/management of many instances of different allocators,
how they are carried around, and how they are supplied to things that want
to use them.
Prior comments led me to suspect you were in favour of exclusive usage of
this new API, and the 'old' new keyword would continue to exist to just
allocate GC memory. At least that's the impression I got.
This leads to a presumption that parameter lists will become clogged with
allocators if they must be micro-managed, references to allocators will end
up all over the place. I'm not into that. I want to know how it will look
in practice.

I appreciate you think that's off-topic, but I find it pretty relevant.

Sure, but the common case is that the author will almost certainly use
Post by Manu
keyword 'new'. How can I affect that as a 3rd party?
This would require me overriding the global allocator somehow... which
you touched on earlier.
The way I'm thinking of this is to install/uninstall user-defined
allocators that will satisfy calls to new. Since not all allocators support
tracing, some would require the user to deallocate stuff manually.
So we need to keep delete then? I also think it may remain useful for
allocators that don't clean up automatically.

The library wouldn't need to worry as there would be the notion of a
Post by Manu
default allocator (probably backed by the existing GC).
Right. So it's looking like the ability to override the global
allocator is a critical requirement.
Again, this is difficult to handle in the same conversation as "should we
pass size upon deallocation". Yes, we need road laws, but it's hard to talk
about those and engine lubrication in the same breath. For me,
"critical" right now is to assess whether the untyped API misses an
important category of allocators, what safety level it has (that's why
"expand" is so different from "realloc"!) etc.
Then we should fork the thread? Or I'll just wait for the next one?
I'm happy to do that, but as I said before, I suspect 'that' discussion
will have impact on this one though, so however awkward, I think it's
relevant.

And does that mean that applications+libraries are required to
Post by Manu
ALWAYS
allocate through given allocator objects?
Yes, they should.
Then we make keyword 'new' redundant?
Probably not. Typed allocators will need to integrate with (and
occasionally replace) the global shared GC.
'typed allocators'? Are they somehow fundamentally separate from this
discussion?
They seem like just a layer above that handles construct/destruct/casts. Do you
anticipate the typed allocation interface to be significantly more than
just a light wrapper?

The problem is that installing an allocator does not get to define
Post by Manu
what a pointer is and what a reference is.
Why not?
Because that requires a language change. I'm not sure you realize but you
move the goalposts all the time. We were talking within the context of
libraries and installing allocators dynamically and all of a sudden you get
to change what the compiler thinks a pointer is.
Deprecating new is definitely a language change.

You said (down lower) "have allocators define their own pointers and
reference types"... so _you_ said about changing what the compiler thinks a
pointer is, that wasn't my idea, I am just trying to roll with that train
of thought, and consider possibilities.

I don't think it's fair to accuse me of moving goal posts when it's
absolutely not clear where they are to begin with. I made the first reply
in this thread, at which point there was no conversation to go from.
I haven't declared any goal posts I'm aware of. I'm just humoring ideas,
and trying to adapt my thinking to your responses, and consider how it
affects my requirements. I'm also trying to read into vague comments about
practical usage that I can't properly visualise without examples.
This is a very important topic to me long-term. It is the source of
innumerable headache and mess throughout my career. D needs to get this
right.

I'm equally confused by your changing position on whether new is important,
should be deprecated, will support global 'new' overrides, assuming that
delete doesn't exist (rules out non-collecting allocators in conjunction
with the keyword allocators), allocation through an allocator object should
be encouraged as standard in user code, typed allocators will need to
integrate with (and occasionally replace) the global GC, or not.
Your position hasn't been static, I'm just trying to work out what it is,
and then consider if I'm happy with it.

You're implying I have a fixed position, I don't (I do have a few
implementation requirements I think are important though, however they end
up being expressed), I had no idea at all what you had in mind at the
start of the thread, I'm just trying to get clarification, and generally
throwing ideas in the pot.

Post by Manu
What I do hope to get to is to have allocators define their own
pointers and reference types. User code that uses those will be
guaranteed certain allocation behaviors.
Interesting, will this mangle the pointer type, or the object type being
pointed to? The latter is obviously not desirable. Does the former
actually work in theory?
I don't think I understand what you mean. Honest, it seems to me you're
confused about what you want, how it can be done, and what moving pieces
are involved.
Apparently I don't understand what you mean. What does "have allocators
define their own pointers and reference types" mean then?
I presumed you mean that an allocator would have some influence on the type
of the pointers they allocate, then it can be known how to handle them
throughout their life.

One example is Rust, which defines several pointer types to implement its
scoping and borrowing semantics. I think it's not easy to achieve its
purported semantics with fewer types, and time will tell whether
programmers will put up with the complexity for the sake of the
corresponding benefits.
I don't think rust allows user-defined allocators to declare new pointer
types, does it?

Post by Manu
I think a critical detail to keep in mind, is that (I suspect) people
simply won't use it if it doesn't interface with keyword 'new'.
I think this is debatable. For one, languages such as Java and C++ still
have built-in "new" but quite ubiquitously unrecommend their usage in user
code. Far as I can tell that's been a successful campaign.
I'm not sure what you mean... if by success you mean people don't use new
in C++, then yes, I have observed that pattern, and it's an absolute
catastrophe in my experience.
Every company/code base I've ever worked on has had some stupid new macro
they roll themselves. And that always leads to problems when interacting
with 3rd party libraries, and generic code in C++ can't work when there's
lots of different ways to allocate memory.

For two, there are allocations that don't even entail calls to new, such as
array concatenation.
This is a very important area of discussion; how will user-defined
allocators fit into implicit allocations?
Shall we split off a few threads so they can be considered 'on topic'?
There are a lot of things that need discussion.

Post by Manu
It's extremely common that I want to enforce that a library exists
entirely within a designated heap. It can't fall back to the global GC.
I work on platforms where memory is not unified. Different resources
need to go into different heaps.
The proposed API would make that entirely possible. You seem to care only
with the appendix "and mind you I only want to use new for all of those!"
which is a bit out there. But nevertheless good feedback.
You often "quote" people, but totally paraphrase them such that it's
nothing like what they said.
I'm sure I've made the point a whole bunch of times that I care only
that "libraries don't hard-code an allocator which can not be overridden by
the application that uses them".
That may sound like "I want to use new for all those"; because libraries
invariably use new, and I can't control that. The reason for that is very
likely nothing more than 'because it's a keyword', it's also convenient,
and perceived as the standard/proper way to allocate memory.
I think that's water well under the bridge, probably in the ocean by now,
new is out there, I don't think it can be revoked.
So as far as I see it, unless you can present a practical solution to
deprecate it, and a path to migrate existing 'new' calls, then new must be
a part of the solution here; it must be overridable.

That seems to be the direction this thread is going anyway, so I'll drop
this thread now. But you can't accuse me of changing my story when I'm only
trying to clarify your story in the first place, which also seems to have
changed (in favour of keeping new judging by the direction of the thread,
and raising DIP46).
Jacob Carlborg
2013-09-24 06:32:10 UTC
Permalink
Post by Andrei Alexandrescu
I talked Walter's ear off a while ago at an ACCU conference about the
notion that reference counting could be a switch passed to the compiler.
Recently he's authored a DIP about the compiler inserting refcounting
primitives in the generated code. Unfortunately I think the DIP is
awfully incomplete and superficial (to the extent it would bring more
damage to the language if any implementation effort would be invested in
it at this point).
Can it be done? Probably. But it would be an absolutely major effort.
I don't know if we're talking about the same things but a couple of us
had a private email conversation about implementing ARC (Automatic
Reference Counting) in the compiler as a result of my suggestion for
adding support for Objective-C in D.

A DIP was never created, at least I cannot find it at the wiki. If I
recall correctly the conversation was never put in the public, which I
think it should be.
Post by Andrei Alexandrescu
I think this is debatable. For one, languages such as Java and C++ still
have built-in "new" but quite ubiquitously unrecommend their usage in
user code. Far as I can tell that's been a successful campaign.
Since when is it _not_ recommended to use "new" in Java?
--
/Jacob Carlborg
Andrei Alexandrescu
2013-09-24 15:29:39 UTC
Permalink
Post by Jacob Carlborg
Post by Andrei Alexandrescu
I talked Walter's ear off a while ago at an ACCU conference about the
notion that reference counting could be a switch passed to the compiler.
Recently he's authored a DIP about the compiler inserting refcounting
primitives in the generated code. Unfortunately I think the DIP is
awfully incomplete and superficial (to the extent it would bring more
damage to the language if any implementation effort would be invested in
it at this point).
Can it be done? Probably. But it would be an absolutely major effort.
I don't know if we're talking about the same things but a couple of us
had a private email conversation about implementing ARC (Automatic
Reference Counting) in the compiler as a result of my suggestion for
adding support for Objective-C in D.
A DIP was never created, at least I cannot find it at the wiki. If I
recall correctly the conversation was never put in the public, which I
think it should be.
I know of this idea and discussed it with Walter. Our provisional
conclusion was that the matter is quite a bit more involved than we
initially thought, and ensuring safety would be extremely difficult.
Post by Jacob Carlborg
Post by Andrei Alexandrescu
I think this is debatable. For one, languages such as Java and C++ still
have built-in "new" but quite ubiquitously unrecommend their usage in
user code. Far as I can tell that's been a successful campaign.
Since when is it _not_ recommended to use "new" in Java?
http://goo.gl/KVmAom


Andrei
Paulo Pinto
2013-09-24 16:12:29 UTC
Permalink
Post by Andrei Alexandrescu
Post by Jacob Carlborg
Post by Andrei Alexandrescu
I think this is debatable. For one, languages such as Java and C++ still
have built-in "new" but quite ubiquitously unrecommend their usage in
user code. Far as I can tell that's been a successful campaign.
Since when is it _not_ recommended to use "new" in Java?
http://goo.gl/KVmAom
Andrei
The top results are from around 2002, when most JVMs still had basic GC
implementations and did not perform escape analysis.


--
Paulo

Robert
2013-09-24 09:49:05 UTC
Permalink
Post by Andrei Alexandrescu
void deallocate(void[] buffer);
This is because the size of D objects is naturally known: classes have
it in the classinfo, slices store it, and the few cases of using bald
pointers for allocation are irrelevant and unrecommended.
One concern though: void[] is just a slice, who guarantees that the size
matches the one that was allocated?

Is the allocator assumed to handle this correctly:

auto arr = allocate(100);
auto arr2 = arr[50..100];
arr = arr[0..50];
deallocate(arr);
deallocate(arr2);

?

Or is the user simply obliged to use the originally returned range?

Best regards,

Robert
Andrei Alexandrescu
2013-09-24 16:05:28 UTC
Permalink
Post by Robert
Post by Andrei Alexandrescu
void deallocate(void[] buffer);
This is because the size of D objects is naturally known: classes have
it in the classinfo, slices store it, and the few cases of using bald
pointers for allocation are irrelevant and unrecommended.
One concern though: void[] is just a slice, who guarantees that the size
matches the one that was allocated?
auto arr = allocate(100);
auto arr2 = arr[50..100];
arr = arr[0..50];
deallocate(arr);
deallocate(arr2);
?
Or is the user simply obliged to use the originally returned range?
Initially I thought the user must pass the exact slice returned by
allocate(). Then I thought we can relax the requirement to any length
between the _requested_ size in allocate and the _received_ length of
the slice. For example:

auto arr = a.allocate(11);

Most allocators will allocate more than 11 bytes. Say the returned arr
has length 64. Then either a.deallocate(arr) or a.deallocate(arr[0 ..
11]) would work. In fact a.deallocate(arr[0 .. n]) would work for any n
between 11 and 64 inclusive. This is because the allocator will apply
whatever rounding mechanism it has applied to the size when allocating,
to the deallocated size.

If the size passed for deallocation is smaller than the size requested,
most allocators won't handle that properly. Interior pointers are also
not handled by most allocators.
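The rounding contract described above can be sketched with a toy fixed-block allocator; BlockAllocator and its 64-byte block size are invented for illustration, not part of the proposed std.allocator API:

```d
// Sketch only: a toy allocator that rounds every request up to
// 64-byte blocks. Any slice length between the requested size and
// the received size rounds back to the same block count, so
// deallocate() handles it correctly.
struct BlockAllocator
{
    enum blockSize = 64;

    private static size_t roundUp(size_t n)
    {
        return (n + blockSize - 1) / blockSize * blockSize;
    }

    void[] allocate(size_t s)
    {
        import core.stdc.stdlib : malloc;
        auto rounded = roundUp(s);
        auto p = malloc(rounded);
        // The caller asked for s bytes but receives the full block.
        return p is null ? null : p[0 .. rounded];
    }

    void deallocate(void[] b)
    {
        import core.stdc.stdlib : free;
        // roundUp(n) is the same for any n in [requested, received],
        // so any such slice identifies the same block.
        free(b.ptr);
    }
}

unittest
{
    BlockAllocator a;
    auto arr = a.allocate(11);
    assert(arr.length == 64);
    a.deallocate(arr[0 .. 11]); // same effect as a.deallocate(arr)
}
```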


Andrei
Walter Bright
2013-09-23 03:58:50 UTC
Permalink
Post by Andrei Alexandrescu
Destroy!
It should be compatible with:

http://wiki.dlang.org/DIP46

or the DIP should be adaptable to work with std.allocator.
Chad Joan
2013-09-24 04:53:24 UTC
Permalink
Post by Walter Bright
Post by Andrei Alexandrescu
Destroy!
http://wiki.dlang.org/DIP46
or the DIP should be adaptable to work with std.allocator.
It seems to me like DIP46 attempts to address Manu's concerns,
and is (mostly) orthogonal to Andrei's designs.

I would wish the current GC to be defined according to Andrei's
new API. It should be the one that handles all allocations by
default at program start.

I have a desire for region based allocation, and in this context
I want to be able to push the thread's default allocator onto a
"thread allocator stack" and thus make all D allocations (new's,
string concats, etc) use the region allocator. So I pretty much
want DIP46, except I want "pushAllocator(...)" and
"popAllocator()" instead of "gc_push()" and "gc_pop()". The
region allocator would also be defined according to Andrei's API.

By default, the region allocator should request giant chunks of
memory from the GC (or whatever is past it in the thread's
allocator stack). This should be manually adjustable so that I
can point it at a malloc-forwarding allocator (Mallocator).

The allocator stack at program start should probably look like
this:

[GCAllocator] <- current
[Mallocator]
<system>

My server might look like this:

// ----------------------
module main;
import core.thread;
import thread_main;

void main(string[] args)
{
... initialization/finalization (thanks scope(exit)!) ...

while ( true )
{
if ( !handshake(connection, ...) )
continue;

auto worker = new Thread(&threadMain);
worker.start();
}
}

// ----------------------
module thread_main;

import std.allocator;

void threadMain()
{
// Construct a region allocator that uses the nearest
// Mallocator to allocate regions.
auto regionAllocator = new RegionAllocator(getMallocator());

// Use it by default for all of this thread's allocations.
pushAllocator( regionAllocator );

scope(exit)
{
// Free 'foo' and everything still allocated by
// the call to doStuff().
regionAllocator.deallocateAll();

// Return control to the GC.
popAllocator();
}

auto foo = new ubyte[40]; // Region allocated.

doStuff();
}

/// Gets the nearest Mallocator in the allocator stack, if it's
/// available. Returns the allocator stack's front if there is
/// none.
Mallocator getMallocator()
{
auto allocatorStackCopy = allocatorStack.save;
auto result = allocatorStackCopy.front;
while ( !allocatorStackCopy.empty )
{
// I just realized that allocators should be a class
// type and not a struct type, otherwise this function
// doesn't work!
// <failure>
// if (is(typeof(allocatorStackCopy.front) == Mallocator))
// </failure>

auto mallocator =
cast(Mallocator)allocatorStackCopy.front;

if ( mallocator !is null )
return mallocator; // nearest match wins

// Advance the copy, or this loop never terminates.
allocatorStackCopy.popFront();
}

return result;
}

// ----------------------

Now, once inside doStuff(), the allocator stack looks like so:

[RegionAllocator] -. <- Current
[GCAllocator] |
[Mallocator] <------'
<system>

// ----------------------

I think I might have managed to demonstrate that Manu's concerns
(and DIP46) have possible implications for Andrei's design:
allocators need dynamic dispatch to be reassignable at runtime,
and classes with inheritance seem like a natural choice for this.
I like my no-cost abstractions though, so I hate to admit the
dynamic nature of this. Is there a better way?

I definitely don't want 3rd parties to, at compile-time
(link-time?), avoid my custom allocators by default.

Small aside: I don't like that statelessOp has to be nothrow. Is
there a reason that we can't pre-allocate a fixed amount of
memory for exception objects, including at least one "can't
allocate proper exception" fatal exception, and then throw the
pre-allocated NoMemoryForException exception if we don't have
enough memory to handle the throws that may occur?
Benjamin Thaut
2013-09-23 05:20:50 UTC
Permalink
Post by Andrei Alexandrescu
Hello,
2. Untyped allocator - traffics exclusively in ubyte[].
Why "ubyte[]" and not "void[]"?
--
Kind Regards
Benjamin Thaut
Andrej Mitrovic
2013-09-23 11:39:48 UTC
Permalink
Post by Benjamin Thaut
Why "ubyte[]" and not "void[]"?
Probably to make the GC avoid scanning into the array.
Andrei Alexandrescu
2013-09-23 14:04:09 UTC
Permalink
Post by Benjamin Thaut
Post by Andrei Alexandrescu
Hello,
2. Untyped allocator - traffics exclusively in ubyte[].
Why "ubyte[]" and not "void[]"?
It's the logical choice at this level.

ubyte[] == "these are octets"


Andrei
Manu
2013-09-23 14:07:34 UTC
Permalink
On 24 September 2013 00:04, Andrei Alexandrescu <
Post by Andrei Alexandrescu
Post by Benjamin Thaut
Post by Andrei Alexandrescu
Hello,
2. Untyped allocator - traffics exclusively in ubyte[].
Why "ubyte[]" and not "void[]"?
It's the logical choice at this level.
ubyte[] == "these are octets"
Isn't that what void[] also means?
Except it says "these are un-typed octets, ie, not a sequence of typed
integers in the range 0-255".
Andrei Alexandrescu
2013-09-23 14:16:27 UTC
Permalink
Post by Manu
On 24 September 2013 00:04, Andrei Alexandrescu
<SeeWebsiteForEmail at erdani.org <mailto:SeeWebsiteForEmail at erdani.org>>
Hello,
2. Untyped allocator - traffics exclusively in ubyte[].
Why "ubyte[]" and not "void[]"?
It's the logical choice at this level.
ubyte[] == "these are octets"
Isn't that what void[] also means?
Except it says "these are un-typed octets, ie, not a sequence of typed
integers in the range 0-255".
I think void[] means "objects of unknown type".

Andrei
Benjamin Thaut
2013-09-23 16:47:56 UTC
Permalink
Post by Andrei Alexandrescu
Post by Manu
On 24 September 2013 00:04, Andrei Alexandrescu
<SeeWebsiteForEmail at erdani.org <mailto:SeeWebsiteForEmail at erdani.org>>
Hello,
2. Untyped allocator - traffics exclusively in ubyte[].
Why "ubyte[]" and not "void[]"?
It's the logical choice at this level.
ubyte[] == "these are octets"
Isn't that what void[] also means?
Except it says "these are un-typed octets, ie, not a sequence of typed
integers in the range 0-255".
I think void[] means "objects of unknown type".
Andrei
I always understood void[] as a block of unknown data, which an allocator
should return in my opinion. What's the point of "void" having a size in
D if we still do it the C way? In my opinion ubyte[] is an array of
values in the range of 0-255, like Manu says. Also if you get a ubyte[]
you might get the impression that it is initialized to all zeros or
something, which might not be true for all allocators (performance ...).
If you get a void[] you know all bets are off, and you have to check if
the allocator preinitialized it or not.

Kind Regards
Benjamin Thaut
Andrei Alexandrescu
2013-09-23 17:08:58 UTC
Permalink
Post by Benjamin Thaut
I always understood void[] as a block of unknown data, which an allocator
should return in my opinion. What's the point of "void" having a size in
D if we still do it the C way? In my opinion ubyte[] is an array of
values in the range of 0-255, like Manu says. Also if you get a ubyte[]
you might get the impression that it is initialized to all zeros or
something, which might not be true for all allocators (performance ...).
If you get a void[] you know all bets are off, and you have to check if
the allocator preinitialized it or not.
You might be right. For example, ubyte[] allows arithmetic on its
elements, which is something one shouldn't ever care to do in an
allocation library.

I'm unclear on what void[] means starting from its semantics. That said,
I replaced ubyte[] with void[] throughout my existing code and it works.


Andrei
Walter Bright
2013-09-23 20:04:49 UTC
Permalink
Post by Andrei Alexandrescu
I'm unclear on what void[] means starting from its semantics.
I agree that it should be void[], as that represents (in D) a block of untyped
data. void[] should be ideal for a primitive allocator.
Andrei Alexandrescu
2013-09-23 20:18:56 UTC
Permalink
Post by Walter Bright
Post by Andrei Alexandrescu
I'm unclear on what void[] means starting from its semantics.
I agree that it should be void[], as that represents (in D) a block of
untyped data. void[] should be ideal for a primitive allocator.
Great, made that change, it all works. Does void[] make any promise
about alignment?

Andrei
Walter Bright
2013-09-23 21:54:44 UTC
Permalink
Does void[] make any promise about alignment?
No.
deadalnix
2013-09-24 00:18:51 UTC
Permalink
On Monday, 23 September 2013 at 20:18:56 UTC, Andrei Alexandrescu
Post by Andrei Alexandrescu
Post by Walter Bright
Post by Andrei Alexandrescu
I'm unclear on what void[] means starting from its semantics.
I agree that it should be void[], as that represents (in D) a
block of
untyped data. void[] should be ideal for a primitive allocator.
Great, made that change, it all works. Does void[] make any
promise about alignment?
Andrei
void[] doesn't make any promises about anything; that is the whole
point.
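If the untyped layer settles on void[], the typed layer becomes the place that reinterprets and initializes the memory. A rough sketch of that division (makeArray is a made-up name here, not the proposed API):

```d
// Sketch: the typed layer casts the untyped void[] into T[].
// The allocator makes no promises about the memory's contents,
// so initialization is the typed layer's job.
T[] makeArray(T, Allocator)(ref Allocator a, size_t n)
{
    void[] raw = a.allocate(T.sizeof * n);
    if (raw is null) return null;
    auto result = cast(T[]) raw[0 .. T.sizeof * n];
    result[] = T.init; // explicit: allocate() did not initialize
    return result;
}
```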
Walter Bright
2013-09-23 20:04:43 UTC
Permalink
Post by Benjamin Thaut
I always understood void[] as a block of unknown data.
Untyped data, not unknown data.
Andrei Alexandrescu
2013-09-23 14:13:33 UTC
Permalink
Post by Andrej Mitrovic
Post by Benjamin Thaut
Why "ubyte[]" and not "void[]"?
Probably to make the GC avoid scanning into the array.
No, that's determined at runtime by how the blocks are allocated.

Andrei
Jacob Carlborg
2013-09-23 07:31:45 UTC
Permalink
Post by Andrei Alexandrescu
I am making good progress on the design of std.allocator, and I am
optimistic about the way it turns out. D's introspection capabilities
really shine through, and in places the design does feel really
archetypal - e.g. "this is the essence of a freelist allocator". It's a
very good feeling. The overall inspiration comes from Berger's
HeapLayers, but D's introspection takes that pattern to a whole new level.
I agree with Manu here. I thought the whole point was to come up with a
design and an API for how the allocators are supposed to be used, and how
they integrate with user code.

Here's a quick idea:

I can think of at least four different usage patterns how an allocator
can be set. Hopefully this will be backwards compatible as well.

1. Globally - The same allocator is used in the whole application by all
threads

2. Thread local - The allocator is set per thread. Different threads can
use different allocators

3. Function local - This is the most intrusive. Sets the allocator for a
given function call

4. Block local - The allocator is used during a given block

This will require some interaction with the runtime and library and
probably use OOP. We define two module variables in some module in
druntime, say rt.allocator:

module rt.allocator:

shared Allocator globalAllocator;
Allocator allocator;

shared this ()
{
globalAllocator = GCAllocator.allocator;
}

The global allocator is, by default, a GC allocator (the GC we're
currently using). For each thread we set the thread local allocator to
be the same as the global allocator:

allocator = globalAllocator;

By default all code will use "new" to allocate memory. druntime will
have three functions which the "new" keyword is lowered to. They could
look something like this:

extern (C) ubyte[] _d_tl_new (size_t size)
{
return _d_new(size, rt.allocator.allocator);
}

extern (C) ubyte[] _d_global_new (size_t size)
{
return _d_new(size, rt.allocator.globalAllocator);
}

extern (C) ubyte[] _d_new (size_t size, Allocator allocator)
{
if (auto memory = allocator.allocate(size))
return memory;

onOutOfMemoryError();
}

By default "new" is lowered to a call to "_d_tl_new", which will
allocate using the thread local allocator, which is by default the same
as the global allocator, that is, the GC. In this way we maintain the
current method of allocating memory.

When using "new shared", it's lowered to a function call to
"_d_global_new", which uses the global allocator.

For block local allocator an ideal syntax would be:

allocator (GCAllocator.allocator)
{
// explicitly use the GC allocator within this block.
}

If that's not possible we can define a function looking like this:

useAllocator (alias allocator, alias block) ()
{
auto currentAllocator = rt.allocator.allocator;
rt.allocator.allocator = allocator;
scope (exit)
rt.allocator.allocator = currentAllocator;

block();
}

Which is used, something like this:

useAllocator!(GCAllocator.allocator, {
// explicitly use the GC allocator within this block.
});

Or alternately, using a struct:

struct UseAllocator
{
private Allocator currentAllocator;

this (Allocator allocator)
{
currentAllocator = rt.allocator.allocator;
rt.allocator.allocator = allocator;
}

~this ()
{
rt.allocator.allocator = currentAllocator;
}
}

UseAllocator useAllocator (Allocator allocator)
{
return UseAllocator(allocator);
}


{
auto _ = useAllocator(GCAllocator.allocator);
// explicitly use the GC allocator within this block.
}

Of course it's also possible to explicitly set the thread local or global
allocator for more fine-grained control. The above functions are just
for convenience and to make sure the allocator is restored.

Function local allocator would just be a function taking an allocator as
a parameter, example:

void process (Allocator) (int a, Allocator allocator =
rt.allocator.allocator)
{
auto _ = useAllocator(allocator);
// use the given allocator throughout the rest of this function scope
}
Post by Andrei Alexandrescu
struct NullAllocator
{
enum alignment = real.alignof;
enum size_t available = 0;
ubyte[] allocate(size_t s) shared { return null; }
bool expand(ref ubyte[] b, size_t minDelta, size_t maxDelta) shared
{ assert(b is null); return false; }
bool reallocate(ref ubyte[] b, size_t) shared
{ assert(b is null); return false; }
void deallocate(ubyte[] b) shared { assert(b is null); }
void collect() shared { }
void deallocateAll() shared { }
static shared NullAllocator it;
}
I would put the asserts in an "in" block.
Post by Andrei Alexandrescu
* "it" is an interesting artifact. Allocators may or may not hold
per-instance state. Those that don't are required to define a global
shared or thread-local singleton called "it" that will be used for all
calls related to allocation. Of course, it is preferred for maximum
flexibility that "it" is shared so as to clarify that the allocator is
safe to use among different threads. In the case of the NullAllocator,
this is obviously true, but non-trivial examples are the malloc-based
allocator and the global garbage collector, both of which implement
thread-safe allocation (at great effort no less).
You have to come up with a better name than "it". If it's some kind of
singleton instance then the standard name is usually "instance". Another
suggestion would be "allocator".
--
/Jacob Carlborg
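Whatever the singleton ends up being called, generic code would pick between the per-type singleton and a passed-in instance via introspection. A sketch assuming the "it" convention from the original post; hasSingleton and allocateFrom are invented names, not proposed API:

```d
// Sketch: detect at compile time whether an allocator is stateless,
// i.e. exposes the proposed "it" singleton, or needs an instance.
enum hasSingleton(A) = is(typeof(A.it.allocate(size_t.init)) : void[])
                    || is(typeof(A.it.allocate(size_t.init)) : ubyte[]);

void[] allocateFrom(A)(size_t n)
    if (hasSingleton!A)
{
    // Stateless allocator: all calls go through the shared singleton.
    return A.it.allocate(n);
}

void[] allocateFrom(A)(ref A a, size_t n)
{
    // Stateful allocator: the caller supplies the instance.
    return a.allocate(n);
}
```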
qznc
2013-09-23 09:31:51 UTC
Permalink
On Monday, 23 September 2013 at 07:31:46 UTC, Jacob Carlborg
Post by Jacob Carlborg
Post by Andrei Alexandrescu
I am making good progress on the design of std.allocator, and
I am
optimistic about the way it turns out. D's introspection
capabilities
really shine through, and in places the design does feel really
archetypal - e.g. "this is the essence of a freelist
allocator". It's a
very good feeling. The overall inspiration comes from Berger's
HeapLayers, but D's introspection takes that pattern to a
whole new level.
I agree with Manu here. I thought the whole point was to come
up with a design and API how the allocators are supposed to be
used. How they integrate with user code.
I can think of at least four different usage patterns how an
allocator can be set. Hopefully this will be backwards
compatible as well.
1. Globally - The same allocator is used in the whole
application by all threads
2. Thread local - The allocator is set per thread. Different
threads can used different allocators
3. Function local - This is the most intrusive. Sets the
allocator for a given function call
4. Block local - The allocator is used during a given block
5. Class local - The allocator is used for specific types (e.g.
ASTNode in a compiler)

6. Class-hierarchy - The allocator is used for a specific type
hierarchy (e.g. ASTNode might have sub classes
Statement,BinOp,etc)

7. Container local - The C++ way which binds allocation to a
wrapping container

8. Task local - like Thread local but for std.parallelism.Task

That's it for now.

This is quite a long list, which means it is probably exhaustive.
There should be a generic approach which encompasses at least
those cases.
Jacob Carlborg
2013-09-23 13:38:33 UTC
Permalink
5. Class local - The allocator is used for specific types (e.g. ASTNode
in a compiler)
6. Class-hierarchy - The allocator is used for a specific type hierarchy
(e.g. ASTNode might have sub classes Statement,BinOp,etc)
7. Container local - The C++ way which binds allocation to a wrapping
container
8. Task local - like Thread local but for std.parallelism.Task
That's it for now.
This is quite a long list, which means it is probably exhaustive. There
should be a generic approach which encompasses at least those cases.
That's a good addition to the list.
--
/Jacob Carlborg
Manu
2013-09-23 14:05:11 UTC
Permalink
Post by qznc
5. Class local - The allocator is used for specific types (e.g. ASTNode
in a compiler)
6. Class-hierarchy - The allocator is used for a specific type hierarchy
(e.g. ASTNode might have sub classes Statement,BinOp,etc)
7. Container local - The C++ way which binds allocation to a wrapping
container
8. Task local - like Thread local but for std.parallelism.Task
That's it for now.
This is quite a long list, which means it is probably exhaustive. There
should be a generic approach which encompasses at least those cases.
That's a good addition to the list.
Another situation that I've encountered on a few occasions...
Imagine I declare a particular allocator for my application, when dealing
with 3rd party libs, I expect it to allocate within my application's heap.
Seems like a situation where I might set a global allocator...
Mr 3rd party library also has its own allocators though, things like pools
or groups which it uses explicitly within its code.
I don't actually want to override these allocators, since they're
effectively a higher-level construct, but I *do* want to override these
allocator's memory source. Ie, they should allocate their pools from my
application's heap, rather than the default heap.
So in this way, it's important to be able to set overrides at multiple
levels.
Jacob Carlborg
2013-09-23 18:56:33 UTC
Permalink
Post by Manu
Another situation that I've encountered on a few occasions...
Imagine I declare a particular allocator for my application, when
dealing with 3rd party libs, I expect it to allocate within my
application's heap. Seems like a situation where I might set a global
allocator...
Mr 3rd party library also has its own allocators though, things like
pools or groups which it uses explicitly within its code.
I don't actually want to override these allocators, since they're
effectively a higher-level construct, but I *do* want to override these
allocator's memory source. Ie, they should allocate their pools from my
application's heap, rather than the default heap.
So in this way, it's important to be able to set overrides at multiple
levels.
You might be able to combine two different allocators. Basically create
a new allocator that somehow extends the third party allocator but
allocates in your own heap instead.
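That combination falls out of HeapLayers-style composition: parameterize the higher-level allocator on its memory source. A sketch with invented names (Pool, parent), not the library's actual types:

```d
// Sketch: a pool that draws its backing memory from any parent
// allocator, so a 3rd-party pool can sit on the application's heap
// instead of the default one.
struct Pool(Parent)
{
    Parent parent;   // e.g. the application's heap allocator
    void[] slab;
    size_t used;

    void[] allocate(size_t s)
    {
        if (slab is null)
            slab = parent.allocate(64 * 1024); // region from parent
        if (slab is null || used + s > slab.length)
            return null;
        auto b = slab[used .. used + s];
        used += s;
        return b;
    }

    void deallocateAll()
    {
        // Hand the whole region back to whatever heap backed it.
        parent.deallocate(slab);
        slab = null;
        used = 0;
    }
}
```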
--
/Jacob Carlborg
Andrei Alexandrescu
2013-09-23 14:10:09 UTC
Permalink
Post by Jacob Carlborg
Post by Andrei Alexandrescu
I am making good progress on the design of std.allocator, and I am
optimistic about the way it turns out. D's introspection capabilities
really shine through, and in places the design does feel really
archetypal - e.g. "this is the essence of a freelist allocator". It's a
very good feeling. The overall inspiration comes from Berger's
HeapLayers, but D's introspection takes that pattern to a whole new level.
I agree with Manu here. I thought the whole point was to come up with a
design and API how the allocators are supposed to be used. How they
integrate with user code.
This is a disconnect. I say "Well, I'm exploring car engines and this
Otto cycle looks promising, here's how a transmission would work, and
the brake system would be hydraulic..." and you say "but I want to know
how to drive a car, and road regulations, and laws!"

Both are important, but they are very difficult to handle simultaneously
in the same conversation.


Andrei
Dicebot
2013-09-23 08:03:36 UTC
Permalink
On Sunday, 22 September 2013 at 23:49:56 UTC, Andrei Alexandrescu
...
In general I like it though I do agree with concerns mentioned
that without speculation about high-level API and usage examples
it is rather hard to evaluate its practical applicability. For
example, you have mentioned somewhere in comments that it is
planned to embed allocator as part of template site like in C++
and it does not really sound a good idea.
Andrei Alexandrescu
2013-09-23 14:12:30 UTC
Permalink
...
In general I like it though I do agree with concerns mentioned that
without speculation about high-level API and usage examples it is rather
hard to evaluate its practical applicability. For example, you have
mentioned somewhere in comments that it is planned to embed allocator as
part of template site like in C++ and it does not really sound a good idea.
That is undecided at this point. It's possible that we collectively get
convinced the efficiency impact of indirect calls is not too high, in
which case we can have containers using a dynamic allocator interface.

Andrei
ponce
2013-09-23 14:22:59 UTC
Great news! It looks like a big improvement on awkward C++
allocators.

(For what it's worth I have a working implementation of aligned
malloc/free/realloc here
https://github.com/p0nce/gfm/blob/master/gfm/core/memory.d, which
can be the basis for an allocator layered upon another)
Andrei Alexandrescu
2013-09-23 14:54:39 UTC
Great news! It looks like a big improvement on awkward C++ allocators.
(For what it's worth I have a working implementation of aligned
malloc/free/realloc here
https://github.com/p0nce/gfm/blob/master/gfm/core/memory.d, which can be
the basis for an allocator layered upon another)
Awesome, thanks. Will look at it.

Andrei
Andrei Alexandrescu
2013-09-23 15:45:25 UTC
Great news! It looks like a big improvement on awkward C++ allocators.
(For what it's worth I have a working implementation of aligned
malloc/free/realloc here
https://github.com/p0nce/gfm/blob/master/gfm/core/memory.d, which can be
the basis for an allocator layered upon another)
I gave this a read, nice work.

One question: what circumstances require run-time alignment values, and
what values would those be? I'm currently under the assumption that
alignments are known during compilation.


Thanks,

Andrei
ponce
2013-09-23 15:56:10 UTC
On Monday, 23 September 2013 at 15:45:25 UTC, Andrei Alexandrescu
Post by Andrei Alexandrescu
Post by ponce
Great news! It looks like a big improvement on awkward C++
allocators.
(For what it's worth I have a working implementation of aligned
malloc/free/realloc here
https://github.com/p0nce/gfm/blob/master/gfm/core/memory.d,
which can be
the basis for an allocator layered upon another)
I gave this a read, nice work.
One question: what circumstances require run-time alignment
values, and what values would those be? I'm currently under the
assumption that alignments are known during compilation.
Thanks,
Andrei
I don't know of a use case for run-time alignment values.
Adam D. Ruppe
2013-09-23 15:02:09 UTC
We should really deprecate the new keyword. It'd break like all
code ever, but with D's templates, there's no need for it, and
when it is there, it is going to spark problems about replacing
global allocators or the library allocators being second class
citizens.

Maybe we could minimize the breakage by just rewriting the
keyword into a library function call, which could then be
overridden with local variables or imports via normal scoping.
Andrei Alexandrescu
2013-09-23 15:03:25 UTC
We should really deprecate the new keyword. It'd break like all code
ever, but with D's templates, there's no need for it, and when it is
there, it is going to spark problems about replacing global allocators
or the library allocators being second class citizens.
Maybe we could minimize the breakage by just rewriting the keyword into
a library function call, which could then be overridden with local
variables or imports via normal scoping.
I'd think new already is translated into a library call. Walter?

Andrei
Jacob Carlborg
2013-09-23 18:59:22 UTC
Post by Andrei Alexandrescu
I'd think new already is translated into a library call. Walter?
Yes, "new" is lowered to a couple different runtime functions. Here's a
list:

http://wiki.dlang.org/Runtime_Hooks
--
/Jacob Carlborg
Jacob Carlborg
2013-09-23 18:59:40 UTC
Post by Jacob Carlborg
http://wiki.dlang.org/Runtime_Hooks
Search for "new".
--
/Jacob Carlborg
Andrei Alexandrescu
2013-09-23 19:13:47 UTC
Post by Jacob Carlborg
Post by Andrei Alexandrescu
I'd think new already is translated into a library call. Walter?
Yes, "new" is lowered to a couple different runtime functions. Here's a
http://wiki.dlang.org/Runtime_Hooks
Thanks, this is great.

Currently "new" is broken because it calls functions taking TypeInfo
objects, instead of calling type-parameterized functions. We must change
that to have "new T(args)" lower into ".opNew!T(args)". Then object.d
defines that function.


Andrei
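A type-parameterized lowering target along the lines proposed above could look roughly like this. This is only a sketch: `opNew` is the name suggested in the post, not an existing druntime symbol, and the GC-backed body is an illustrative assumption.

```d
import core.memory : GC;
import std.conv : emplace;

// Hypothetical lowering target for `new T(args)`: fully typed, so no
// TypeInfo objects are passed at run time.
T opNew(T, Args...)(Args args) if (is(T == class))
{
    enum size = __traits(classInstanceSize, T);
    // Allocate raw storage from the GC, then construct in place.
    auto chunk = GC.malloc(size)[0 .. size];
    return emplace!T(chunk, args);
}
```

With something like this in object.d, `new T(args)` could lower to `.opNew!T(args)`, and constructor overload resolution happens at compile time rather than through TypeInfo.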
Benjamin Thaut
2013-09-23 17:04:43 UTC
We should really deprecate the new keyword. It'd break like all code
ever, but with D's templates, there's no need for it, and when it is
there, it is going to spark problems about replacing global allocators
or the library allocators being second class citizens.
It's not possible to replace new with a library function in all cases.
There are many examples where it breaks (I tried it, believe me). Just
let me give a few:


class A
{
    class B
    {
    }

    B m_b;

    this()
    {
        // uses the surrounding context;
        // not possible with a library function
        m_b = new B();
    }
}

Also there is the problem that D does not have perfect forwarding. That
means if "new" is a library function it will cause additional copies /
moves of structs, which is not the case with the built-in new. Next,
there are implicit conversions of literals. Those work just fine with
the built-in new, but will break with a templated library new.

E.g.

class C
{
    this(ubyte a, ubyte b)
    {
    }
}

new C(5, 3); // fine
New!C(5, 3); // Error: cannot implicitly convert int to ubyte

Unless of course you want to write a template metaprogram that does
those implicit conversions. Even then, the problem would remain that you
can't know whether the value you get comes from a literal or from a
function call, variable, etc.

The next thing is arrays. While you get the nice array syntax with the
built-in new, a library new just looks ugly.

new int[5];
vs.
NewArray!int(5); // Can't use New! here because it would conflict with
the New! for creating objects / structs

I've been using a library-implemented new / delete for over a year now,
and it's annoying; it makes the language look ugly and feel strange to
use. See my allocator and New / Delete implementation here:

https://github.com/Ingrater/druntime/blob/master/src/core/allocator.d

I would rather want new to be overloadable and have two sets of parameters:

new (allocator)(arg1, arg2)

Where "allocator" would go to the overloaded version of new and "arg1"
and "arg2" will be passed to the constructor.

I think it's a really bad idea to deprecate new.

Replacing Delete works just fine with a library function though.

Kind Regards
Benjamin Thaut
Nick Sabalausky
2013-09-23 23:29:39 UTC
On Mon, 23 Sep 2013 17:02:09 +0200
Post by Adam D. Ruppe
We should really deprecate the new keyword. It'd break like all
code ever, but with D's templates, there's no need for it, and
when it is there, it is going to spark problems about replacing
global allocators or the library allocators being second class
citizens.
I think that's addressing the wrong problem; it wouldn't solve much,
if anything. We'd just end up with a lot of lib writers (and others)
hardcoding in usage of the default allocator. So it'd be exactly the
same as now, just with uglier syntax and a ton of breakage.

What's ultimately needed is a way to change the default allocator for
not only "new" but also array concatenation, closures, and any other
part of the language that might allocate. That way, we actually *do* get
the benefits we're looking for, plus we keep the nicer current syntax
and no breakage.

tl;dr: "new" is a red herring. The real issue is being able to easily
replace the default allocator.
H. S. Teoh
2013-09-24 00:03:00 UTC
Post by Nick Sabalausky
On Mon, 23 Sep 2013 17:02:09 +0200
Post by Adam D. Ruppe
We should really deprecate the new keyword. It'd break like all
code ever, but with D's templates, there's no need for it, and
when it is there, it is going to spark problems about replacing
global allocators or the library allocators being second class
citizens.
I think that's addressing the wrong problem; it wouldn't solve much,
if anything. We'd just end up with a lot of lib writers (and others)
hardcoding in usage of the default allocator. So it'd be exactly the
same as now, just with uglier syntax and a ton of breakage.
What's ultimately needed is a way to change the default allocator for
not only "new" but also array concatenation, closures, and any other
part of the language that might allocate. That way, we actually *do* get
the benefits we're looking for, plus we keep the nicer current syntax
and no breakage.
tl;dr: "new" is a red herring. The real issue is being able to easily
replace the default allocator.
I thought Walter's DIP already addresses the issue of replacing the
default allocator?

http://wiki.dlang.org/DIP46

I get the feeling that we don't have a good handle on the fundamental
issues, though. Having a stack for managing the default allocator works
for the simpler cases, but it's unclear how things would interact once
you throw in other requirements, like class/struct-scoped allocators,
block scoped allocators inside closures, user-specified allocators that
must interact with the foregoing, etc..

For example, it's relatively easy to understand this:

void doWork() {
    gc_push(MyAlloc);
    scope(exit) gc_pop();
    ... // do work
}
void doMoreWork() {
    gc_push(AnotherAlloc);
    scope(exit) gc_pop();
    ... // do work
}
void main() {
    doWork();
    doMoreWork();
}

But once you throw in closures and delegates, things become rather murky:

auto getCallback() {
    gc_push(MyAlloc);
    scope(exit) gc_pop();

    int closed_var;
    return {
        // What does this do?
        closed_var++;

        string x = "abc";
        x ~= "def"; // which allocator does this use?
    };
}
void main() {
    auto dg = getCallback();
    gc_push(OtherAlloc);
    scope(exit) gc_pop();
    dg(); // Hmm...
}


T
--
May you live all the days of your life. -- Jonathan Swift
Jacob Carlborg
2013-09-24 06:35:57 UTC
Post by H. S. Teoh
I thought Walter's DIP already addresses the issue of replacing the
default allocator?
http://wiki.dlang.org/DIP46
I get the feeling that we don't have a good handle on the fundamental
issues, though. Having a stack for managing the default allocator works
for the simpler cases, but it's unclear how things would interact once
you throw in other requirements, like class/struct-scoped allocators,
block scoped allocators inside closures, user-specified allocators that
must interact with the foregoing, etc..
Why not just let the user set the allocator explicitly, see:

http://forum.dlang.org/thread/l1nvn4$ca3$1@digitalmars.com?page=2#post-l1oqp2$29fg$1@digitalmars.com
--
/Jacob Carlborg
deadalnix
2013-09-23 17:01:46 UTC
On Sunday, 22 September 2013 at 23:49:56 UTC, Andrei Alexandrescu
Post by Andrei Alexandrescu
Hello,
First of all, awesome !

Now the meeeeh part.

I really think the "it" thing is not good. I don't think it is
desirable or necessary. We should get rid of it.

You can't deal with ubyte[] like that; it is incorrect with regard to
the (as yet unimplemented) aliasing rules. Allocators should deal with
void[].

What you call safe really isn't. Allocate something on the GC,
store a pointer in a custom-allocated location, collect, enjoy
the memory corruption. All operations are safe according to your
proposal. Allocation can only be safe if the GRAND MASTER GC is
aware of it.

Your proposal allocates shared memory. This is common in the C/C++
world, as memory is shared by default, but it shouldn't be in D. It
is highly desirable to allocate with different methods for
different type qualifiers. How does your design adapt to that?

Finally, we have to decide how these basic blocks are used to form
typed allocators, and how they interact with language constructs.

Sorry if this has been mentioned before; it is really late here and
I can't read the whole thread, especially since Manu is on
steroids :D
Andrei Alexandrescu
2013-09-23 17:16:40 UTC
Post by deadalnix
Post by Andrei Alexandrescu
Hello,
First of all, awesome !
Now the meeeeh part.
I really think the "it" thing is not good. I don't think it is desirable
or necessary. We should get rid of it.
The singleton allocator "it" is instrumental for supporting stateful and
stateless allocators with ease. It took me a few iterations to get to
that, and I'm very pleased with the results. I think it would be
difficult to improve on it without making other parts more difficult.
Post by deadalnix
You can't deal with ubyte[] like that; it is incorrect with regard to
the (as yet unimplemented) aliasing rules. Allocators should deal with void[].
I think I'll do that.
Post by deadalnix
What you call safe really isn't. Allocate something on the GC, store a
pointer in a custom-allocated location, collect, enjoy the memory
corruption.
I don't understand this. There are no pointers at this level, only
untyped memory. The main chance of corruption is to access something
after it's been freed.
Post by deadalnix
All operations are safe according to your proposal.
No, for most allocators freeing and reallocating are unsafe.
Post by deadalnix
Allocation can only be safe if the GRAND MASTER GC is aware of it.
I don't think so.
Post by deadalnix
Your proposal allocates shared memory.
No. It allocates unshared memory off of a shared or unshared allocator.
The memory just allocated is as of yet unshared for the simple reason
that only one thread has as of yet access to it.
Post by deadalnix
This is common in the C/C++ world, as memory is shared by default, but
it shouldn't be in D. It is highly desirable to allocate with different
methods for different type qualifiers. How does your design adapt to that?
The typed allocators will have distinct methods for shared and unshared
allocation. Even at this level it's possible to design an allocator that
allocates in different ways with a shared vs. an unshared allocator
object (overload on shared). So far I've only designed non-shared
allocators, or wrap those that are already shared (malloc and new).
Post by deadalnix
Finally, we have to decide how these basic blocks are used to form typed
allocators, and how they interact with language constructs.
Sure. Again: I describe the Otto cycle and you discuss how to paint the
road.


Andrei
Robert Schadek
2013-09-23 18:55:37 UTC
Post by Andrei Alexandrescu
Post by deadalnix
Finally, we have to decide how these basic blocks are used to form typed
allocators, and how they interact with language constructs.
Sure. Again: I describe the Otto cycle and you discuss how to paint
the road.
IMO there is a problem with this metaphor. If the road is painted
metallic the Otto motor (cycle) needs cooling, otherwise ... boom. So I
think the overall picture has to be kept in sight.
Andrei Alexandrescu
2013-09-23 19:11:05 UTC
Post by Robert Schadek
Post by Andrei Alexandrescu
Post by deadalnix
Finally, we have to decide how these basic blocks are used to form typed
allocators, and how they interact with language constructs.
Sure. Again: I describe the Otto cycle and you discuss how to paint
the road.
IMO there is a problem with this metaphor. If the road is painted
metallic the Otto motor (cycle) needs cooling, otherwise ... boom. So I
think the overall picture has to be kept in sight.
Which makes the metaphor unintentionally better. Yes, we do need to
worry about higher concerns because that's how layered abstractions work.

Andrei
Robert Schadek
2013-09-23 19:34:26 UTC
Post by Andrei Alexandrescu
Post by Robert Schadek
Post by Andrei Alexandrescu
Post by deadalnix
Finally, we have to decide how these basic blocks are used to form typed
allocators, and how they interact with language constructs.
Sure. Again: I describe the Otto cycle and you discuss how to paint
the road.
IMO there is a problem with this metaphor. If the road is painted
metallic the Otto motor (cycle) needs cooling, otherwise ... boom. So I
think the overall picture has to be kept in sight.
Which makes the metaphor unintentionally better. Yes, we do need to
worry about higher concerns because that's how layered abstractions work.
Andrei
You're right! And I'm too tired to read properly.
deadalnix
2013-09-24 00:12:26 UTC
On Monday, 23 September 2013 at 17:16:40 UTC, Andrei Alexandrescu
Post by Andrei Alexandrescu
The singleton allocator "it" is instrumental for supporting
stateful and stateless allocators with ease. It took me a few
iterations to get to that, and I'm very pleased with the
results. I think it would be difficult to improve on it without
making other parts more difficult.
The allocator can use a singleton internally without using "it". I'm
unclear on what problem is being solved.
Post by Andrei Alexandrescu
Post by deadalnix
What you call safe really isn't. Allocate something on the GC, store a
pointer in a custom-allocated location, collect, enjoy the memory
corruption.
I don't understand this. There are no pointers at this level,
only untyped memory. The main chance of corruption is to access
something after it's been freed.
If there are no pointers at this level, then the collect method does
not make any sense.
Post by Andrei Alexandrescu
Post by deadalnix
Allocation can only be safe if the GRAND MASTER GC is aware of it.
I don't think so.
When the GRAND MASTER GC is unhappy, he frees live objects.
Andrei Alexandrescu
2013-09-24 01:08:37 UTC
Post by Andrei Alexandrescu
The singleton allocator "it" is instrumental for supporting stateful
and stateless allocators with ease. It took me a few iterations to get
to that, and I'm very pleased with the results. I think it would be
difficult to improve on it without making other parts more difficult.
The allocator can use a singleton internally without using "it". I'm
unclear on what problem is being solved.
Composability. My first design had stateful allocators define regular
methods and stateless (better said, global) allocators define static
methods. That seemed to work well, but then allocators composed with
them also had to define static or non-static methods depending on
whether the parent allocators had state.

Using a singleton instance for stateless allocators has simplified
things somewhat - everybody defines regular methods, and if they are
stateless they define the singleton instance.


Andrei
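The convention described above can be sketched as follows (illustrative only; the proposal's actual Mallocator may differ in detail):

```d
import core.stdc.stdlib : free, malloc;

// A stateless allocator: it defines regular (shared) methods plus a
// singleton instance named `it`, so composing allocators can always
// write `parent.allocate(...)` whether or not the parent has state.
struct Mallocator
{
    enum alignment = real.alignof;
    static shared Mallocator it;  // the one shared instance

    void[] allocate(size_t s) shared
    {
        auto p = malloc(s);
        return p ? p[0 .. s] : null;
    }

    void deallocate(void[] b) shared { free(b.ptr); }
}
```

Usage: `auto b = Mallocator.it.allocate(16);` reads the same as `myRegion.allocate(16)` on a stateful allocator, which is the composability win described above.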
BLM768
2013-09-23 18:02:55 UTC
Post by deadalnix
What you call safe really isn't. Allocate something on the GC,
store a pointer on a custom allocated location, collect, enjoy
the memory corruption. All operation are safe according to your
proposal. Allocation can only be safe if the GRAND MASTER GC is
aware of it.
I don't see why the GC can't be allocator-agnostic. In principle,
it should be possible to register arbitrary allocators with the
GC (possibly by storing a pointer to the allocator in each
GC-managed block of memory). The GC would have a default
allocator which would be used with the "new" operator, and
something like the aforementioned create!() syntax could be used
for custom allocators:

SomeClass c1 = new SomeClass("arg");
//Managed by the GC
SomeClass c2 = gcCreate!SomeClass(someAllocator, "arg");
//Another possible syntax (much prettier!):
SomeClass c3 = someAllocator.new SomeClass("arg");

This could also integrate with DIP46; instead of pushing and
popping a GC object, one would push and pop the GC's default
allocator (for the current thread, of course).

There would be a bit of overhead, but that seems to be
unavoidable in a sufficiently generic design, and I doubt that it
would cut performance to an unacceptable level, especially
because really performance-critical code would avoid the GC
anyway.
Post by deadalnix
You proposal allocate shared memory. This is common in C/C++
world as memory is shared by default, but shouldn't be in D. It
is highly desirable to allocate with different methods for
different type qualifier. How does your design adapt to that ?
If the aforementioned thread-local GC reference is implemented,
it should be possible to provide decent thread-local allocation.
I'm not sure exactly how to design that, but it seems like
thread-local GC allocation would require some work at the GC
level; a global GC won't necessarily play nicely with
thread-local allocators.

For non-GC thread-local allocations, I'd imagine that you could
just create a thread-local allocator.
Timon Gehr
2013-09-23 20:32:11 UTC
Post by Andrei Alexandrescu
I am making good progress on the design of std.allocator, and I am
optimistic about the way it turns out. D's introspection capabilities
really shine through, and in places the design does feel really
archetypal - e.g. "this is the essence of a freelist allocator". It's a
very good feeling. The overall inspiration comes from Berger's
HeapLayers, but D's introspection takes that pattern to a whole new level.
...
Seems neat.
Post by Andrei Alexandrescu
Several details are still in flux, but at the top level it seems most
1. Typed allocator, i.e. every request for allocation comes with the
exact type requested;
2. Untyped allocator - traffics exclusively in ubyte[].
...
Some general remarks:

One issue you will probably run into and maybe want to fix in some way
during the typed allocator design is that private constructors cannot be
called from templates parameterized on types.

E.g.:

module a;

auto New(T,Args...)(Args args){
    return new T(args);
}

// ---

module b;

class A{
    private this(){}
}

void main(){
    auto a = New!A(); // error
}

Is there an intention to also support replacements of instance new
allocations, i.e. object.new subobject(args)?
Andrei Alexandrescu
2013-09-23 21:15:44 UTC
Post by Timon Gehr
One issue you will probably run into and maybe want to fix in some way
during the typed allocator design is that private constructors cannot be
called from templates parameterized on types.
Hrm, that's a real problem. We've run into it at work when we were
trying to replace a macro with a template. The macro is automatically
friend with everyone! :o)
Post by Timon Gehr
Is there an intention to also support replacements of instance new
allocations, i.e. object.new subobject(args)?
It's a good point to keep in mind as we move further with this.


Andrei
Jacob Carlborg
2013-09-24 06:45:40 UTC
Post by Timon Gehr
One issue you will probably run into and maybe want to fix in some way
during the typed allocator design is that private constructors cannot be
called from templates parameterized on types.
module a;
auto New(T,Args...)(Args args){
return new T(args);
}
// ---
module b;
class A{
private this(){}
}
void main(){
auto a = New!A(); // error
}
Allocate the memory needed for T without using "new". Then a pointer can
be used to bypass the protection of the constructor, something like:

extern (C) Object _d_newclass(in ClassInfo);

T New (T, Args ...) (Args args)
{
    auto object = cast(T) _d_newclass(T.classinfo);

    // use explicit type to handle overloads.
    T delegate (Args) dg = &object.__ctor;

    return dg(args);
}
--
/Jacob Carlborg
Peter Alexander
2013-09-23 20:37:29 UTC
On Sunday, 22 September 2013 at 23:49:56 UTC, Andrei Alexandrescu
Post by Andrei Alexandrescu
1. Typed allocator, i.e. every request for allocation comes
with the exact type requested;
2. Untyped allocator - traffics exclusively in ubyte[].
This is a good design.

Perhaps this is a later concern, but we should make sure that
node-based containers (e.g. linked lists, trees) have a way to
access the allocation size needed for a node. STL did not do
this.
Post by Andrei Alexandrescu
* alignment is a compile-time constant yielding the alignment
of allocated data. Many allocators offer the maximum alignment
of the platform (for which I used real.alignof above). This
constant is required for all allocators.
What if, for example, I wanted to allocate a 4096 byte aligned
block from the GC allocator? Do I have to create a new allocator
backed by the GC allocator?

What if the alignment is not known at compile time (e.g. hard
disk page size or CPU cache line size)?

Might be better to pass the desired alignment in the allocate
method.
Post by Andrei Alexandrescu
* available is a property that returns how many total (not
necessarily contiguous) bytes are available for allocation. The
NullAllocator knows statically it has 0 bytes available so it
implements it as an enum. Generally allocators will implement
It would be useful to know the maximum available contiguous block
size too, so that you can find out if an allocation will succeed
without calling allocate. It's also useful for diagnosing
fragmentation issues e.g. "allocation failed, free memory = X,
max contiguous = Y". If X is high and Y is low then you are
highly fragmented.

Of course, this should be optional.
Post by Andrei Alexandrescu
* allocate(s) returns a ubyte[] with length at least s, or
null. (It does not throw on out-of-memory because that would
hurt composability; it turns out many elemental allocators do
naturally run out of memory.) This method is required for all
What are the semantics of allocate(0)? malloc(0) is
implementation defined.
Andrei Alexandrescu
2013-09-23 21:27:36 UTC
What if, for example, I wanted to allocate a 4096 byte aligned block
from the GC allocator? Do I have to create a new allocator backed by the
GC allocator?
I've been thinking a good library component would be AlignmentAllocator
that would work like this:

struct AlignmentAllocator(Allocator, size_t desiredAlignment)
    if (Allocator.alignment < desiredAlignment)
{
    enum alignment = desiredAlignment;
    ...
}

That allocator would allocate more memory (I suspect there's a gcd
calculation somewhere :o)) and then adjust the starting pointer of the
allocated block to reach the requested alignment.

I'm just off the phone with Walter discussing this and related issue.
It's an interesting problem space.
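The adjustment step at the heart of that idea is simple to sketch (the name here is made up; assumes a power-of-two alignment):

```d
// Round the start of a parent-provided block up to `alignment` and
// return the rest. To guarantee n usable bytes, the wrapping allocator
// requests n + alignment - 1 from the parent.
void[] roundUpToAligned(void[] raw, size_t alignment)
{
    assert(alignment && !(alignment & (alignment - 1)),
        "alignment must be a power of two");
    auto addr = cast(size_t) raw.ptr;
    auto aligned = (addr + alignment - 1) & ~(alignment - 1);
    auto offset = aligned - addr;
    assert(offset <= raw.length);
    return raw[offset .. $];
}
```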
What if the alignment is not known at compile time (e.g. hard disk page
size or CPU cache line size)?
The proposed design assumes compile-time alignment values. A trick
similar to the one above (overallocate and adjust the pointer) should work.

I'd need a handful of good examples where the alignment must be known at
runtime. Can you link to some?
Might be better to pass the desired alignment in the allocate method.
The thing is in 99%+ of the cases you don't need it. Then perhaps in
99%+ of the remaining cases the alignment is known during compilation.
Nevertheless it's a change worth investigating.
Post by Andrei Alexandrescu
* available is a property that returns how many total (not necessarily
contiguous) bytes are available for allocation. The NullAllocator
knows statically it has 0 bytes available so it implements it as an
property is optional.
It would be useful to know the maximum available contiguous block size
too, so that you can find out if an allocation will succeed without
calling allocate.
I think the best way is to just go ahead and attempt an allocation,
especially if the allocator is shared (races!). (There's some
flexibility with expand() there, which has a minimum size and a desired
size.) But clearly the information of the largest contiguous block is
currently missing, and is reasonable to be considered for addition.
It's also useful for diagnosing fragmentation issues
e.g. "allocation failed, free memory = X, max contiguous = Y". If X is
high and Y is low then you are highly fragmented.
Of course, this should be optional.
Makes sense.
Post by Andrei Alexandrescu
* allocate(s) returns a ubyte[] with length at least s, or null. (It
does not throw on out-of-memory because that would hurt composability;
it turns out many elemental allocators do naturally run out of
memory.) This method is required for all allocators. In most
What are the semantics of allocate(0)? malloc(0) is implementation defined.
I defined Mallocator to return whatever malloc returns on allocate(0),
but decided realloc() is too messed up and special-cased reallocate(0)
to free the memory and place null in the buffer. Probably I'll work the
same logic into Mallocator.allocate(0) to eliminate platform dependencies.


Andrei
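That special-casing might look like this (a sketch only; the proposal's actual Mallocator may differ):

```d
import core.stdc.stdlib : free, realloc;

// Sketch: reallocate(b, 0) frees and nulls the buffer itself instead of
// relying on realloc(p, 0), whose behavior is implementation-defined.
bool reallocate(ref void[] b, size_t s)
{
    if (s == 0)
    {
        free(b.ptr);  // free(null) is a harmless no-op
        b = null;
        return true;
    }
    auto p = realloc(b.ptr, s);
    if (!p) return false;
    b = p[0 .. s];
    return true;
}
```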
Peter Alexander
2013-09-23 22:12:05 UTC
On Monday, 23 September 2013 at 21:27:36 UTC, Andrei Alexandrescu
Post by Andrei Alexandrescu
That allocator would allocate more memory (I suspect there's a
gcd calculation somewhere :o)) and then adjust the starting
pointer of the allocated block to reach the requested alignment.
That's quite an inefficient use of memory for large alignment
sizes.

Also, how does it work with your deallocate interface? Suppose I
request an 0x100 aligned block of 0x100 bytes. Your alignment
allocator requests 0x200 from the GC, which maybe returns
[0xffff0040-0xffff0240] and then returns an aligned buffer from
that [0xffff0100-0xffff0200]. Later, I try to deallocate that
buffer, which means your alignment allocator has to deallocate
the original buffer. Where does the alignment allocator store the
original GC-provided buffer ptr + size? It cannot be determined
from the buffer I gave it.
Post by Andrei Alexandrescu
I'd need a handful of good examples where the alignment must be
known at runtime. Can you link to some?
mprotect requires a pointer that is a multiple of the system page
size, which isn't constant.

http://linux.die.net/man/2/mprotect


Reading a file without buffering on Windows requires that you
align the destination buffer on volume sector size boundaries,
which is not known at compile time (although many just assume
4096).

http://msdn.microsoft.com/en-us/library/windows/desktop/cc644950(v=vs.85).aspx


As I mentioned previously, you may want to also align to the
cache line size (for performance). This is not known at compile
time (although again, is often assumed in practice).
Andrei Alexandrescu
2013-09-23 23:21:21 UTC
Post by Andrei Alexandrescu
That allocator would allocate more memory (I suspect there's a gcd
calculation somewhere :o)) and then adjust the starting pointer of the
allocated block to reach the requested alignment.
That's quite an inefficient use of memory for large alignment sizes.
I don't see a way around it unless the allocator natively provides a
large alignment size, which it would then advertise in its "alignment"
constant. The problem is independent of this particular design.
Also, how does it work with your deallocate interface? Suppose I request
an 0x100 aligned block of 0x100 bytes. Your alignment allocator requests
0x200 from the GC, which maybe returns [0xffff0040-0xffff0240] and then
returns an aligned buffer from that [0xffff0100-0xffff0200]. Later, I
try to deallocate that buffer, which means your alignment allocator has
to deallocate the original buffer. Where does the alignment allocator
store the original GC-provided buffer ptr + size? It cannot be
determined from the buffer I gave it.
Walter noted the same issue. I'd need to think about it. A simple
solution is to simply store the size just before the pointer returned,
in a dope block manner.
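That dope-block scheme can be sketched like this (illustrative names, not part of the proposal): reserve room for one pointer ahead of the aligned block and stash the parent's original pointer there, so deallocate can recover it.

```d
// Align the usable block, keeping the parent's original pointer in a
// "dope" slot immediately before it. Assumes a power-of-two alignment;
// the parent block must be over-allocated by at least
// alignment - 1 + (void*).sizeof bytes to be safe.
void[] attachDope(void[] parentBlock, size_t alignment)
{
    auto base = cast(size_t) parentBlock.ptr;
    auto aligned = (base + (void*).sizeof + alignment - 1) & ~(alignment - 1);
    assert(aligned - base <= parentBlock.length);
    // Store the original pointer immediately before the aligned block.
    *cast(void**)(aligned - (void*).sizeof) = parentBlock.ptr;
    return (cast(void*) aligned)[0 .. parentBlock.length - (aligned - base)];
}

// deallocate() reads the dope slot to get back the parent's pointer.
void* dopeOrigin(void[] b)
{
    return *cast(void**)(cast(size_t) b.ptr - (void*).sizeof);
}
```

The cost is that even a parent block that was already aligned gets bumped to make room for the dope slot, up to one extra alignment unit per allocation.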
Post by Andrei Alexandrescu
I'd need a handful of good examples where the alignment must be known
at runtime. Can you link to some?
mprotect requires a pointer that is a multiple of the system page size,
which isn't constant.
http://linux.die.net/man/2/mprotect
Thanks! I see posix_memalign is a corresponding call that returns a
page-aligned chunk of memory. I assume calls like mmap also return
page-aligned chunks. Indeed it is a problem for the current design that
the alignment must be known during compilation.
Reading a file without buffering on Windows requires that you align the
destination buffer on volume sector size boundaries, which is not known
at compile time (although many just assume 4096).
http://msdn.microsoft.com/en-us/library/windows/desktop/cc644950(v=vs.85).aspx
I see there's a difference between x86 and Itanium in page size...

I'm trying to lean toward handling these as rare/exotic cases without
outright eliminating them, while providing simple and efficient handling
of alignment for the frequent cases (i.e. allocate ints at addresses
multiple of 4 etc).
As I mentioned previously, you may want to also align to the cache line
size (for performance). This is not known at compile time (although
again, is often assumed in practice).
Indeed.


Andrei
Manu
2013-09-24 04:56:33 UTC
On 24 September 2013 09:21, Andrei Alexandrescu
Post by Andrei Alexandrescu
That allocator would allocate more memory (I suspect there's a gcd
calculation somewhere :o)) and then adjust the starting pointer of the
allocated block to reach the requested alignment.
That's quite an inefficient use of memory for large alignment sizes.
I don't see a way around it unless the allocator natively provides a large
alignment size, which it would then advertise in its "alignment" constant.
The problem is independent of this particular design.
There's an important distinction between an allocator advertising a large
allocation size (which it always uses), and an allocator being capable of
allocating a larger aligned block if requested.
Surely the advertised alignment is the alignment you will get in all cases,
even if you request a small alignment.
You should advertise a minimum and maximum alignment if that is your
solution.

Also, how does it work with your deallocate interface? Suppose I request
an 0x100 aligned block of 0x100 bytes. Your alignment allocator requests
0x200 from the GC, which maybe returns [0xffff0040-0xffff0240] and then
returns an aligned buffer from that [0xffff0100-0xffff0200]. Later, I
try to deallocate that buffer, which means your alignment allocator has
to deallocate the original buffer. Where does the alignment allocator
store the original GC-provided buffer ptr + size? It cannot be
determined from the buffer I gave it.
Walter noted the same issue. I'd need to think about it. A simple solution
is to simply store the size just before the pointer returned, in a dope
block manner.
This means if the allocator returns you memory that is already of the
desired alignment, you need to waste an entire aligned block just to store
the offset, which would have been 0.
I solve this problem by storing my allocation table in a separate hash
table, and entries in that table associate the size and also the alignment
offset. But it's not a good solution, it's still wasteful. It's just the
best option you have if you don't have any control over the system
allocator.

I also use large alignments which I've detailed in prior discussions.
Common ones are cache line (~64-256 bytes), and gpu memory page (~4k).
You can't go wasting GPU memory by overallocating every block.
It's definitely important that allocators are able to receive an alignment
request, and are given the opportunity to fulfill a dynamic alignment
request without always resorting to an over-allocation strategy.

I'd need a handful of good examples where the alignment must be known
Post by Andrei Alexandrescu
at runtime. Can you link to some?
mprotect requires a pointer that is a multiple of the system page size,
which isn't constant.
http://linux.die.net/man/2/mprotect
Thanks! I see posix_memalign is a corresponding call that returns a
page-aligned chunk of memory. I assume calls like mmap also return
page-aligned chunks. Indeed it is a problem for the current design that the
alignment must be known during compilation.
Reading a file without buffering on Windows requires that you align the
destination buffer on volume sector size boundaries, which is not known
at compile time (although many just assume 4096).
http://msdn.microsoft.com/en-us/library/windows/desktop/cc644950(v=vs.85).aspx
I see there's a difference between x86 and Itanium in page size...
I'm trying to lean toward handling these as rare/exotic cases without
outright eliminating them, while providing simple and efficient handling of
alignment for the frequent cases (i.e. allocate ints at addresses multiple
of 4 etc).
As I mentioned previously, you may want to also align to the cache line
size (for performance). This is not known at compile time (although
again, is often assumed in practice).
Indeed.
Andrei
Andrei Alexandrescu
2013-09-24 05:31:19 UTC
Permalink
Post by Manu
You can't go wasting GPU memory by overallocating every block.
Only the larger chunk may need to be overallocated if all allocations
are then rounded up.
Post by Manu
It's definitely important that allocators are able to receive an
alignment request, and are given the opportunity to fulfill a dynamic
alignment request without always resorting to an over-allocation strategy.
I'd need a bit of convincing. I'm not sure everybody needs to pay for the
needs of a few, and it is quite possible that malloc_align suffers from the same
fragmentation issues as the next guy. Also, there's always the
possibility of leaving some bits to lower-level functions.


Andrei
Manu
2013-09-24 06:06:56 UTC
Permalink
On 24 September 2013 15:31, Andrei Alexandrescu <
Post by Manu
You can't go wasting GPU memory by overallocating every block.
Only the larger chunk may need to be overallocated if all allocations are
then rounded up.
I don't follow.
If I want to allocate 4k aligned, then 8k will be allocated (because it
wants to store an offset).
Any smaller allocation, say 16 bytes, will round up to 4k. You can't
waste precious gpu ram like that.

A minimum and a maximum (guaranteed without over-allocating) alignment may
be useful.
But I think allocators need to be given the opportunity to do the best
they can.

Post by Manu
It's definitely important that allocators are able to receive an
alignment request, and are given the opportunity to fulfill a dynamic
alignment request without always resorting to an over-allocation strategy.
I'd need a bit of convincing. I'm not sure everybody needs to pay for the
needs of a few, and it is quite possible that malloc_align suffers from the same
fragmentation issues as the next guy. Also, there's always the possibility
of leaving some bits to lower-level functions.
What are they paying exactly? An extra arg to allocate that can probably be
defaulted?
void[] allocate(size_t bytes, size_t align = this.alignment) shared;

Or is it the burden of adding the overallocation boilerplate logic to each
allocator for simple allocators that don't want to deal with alignment in a
conservative way?
I imagine that could possibly be automated, the boilerplate could be given
as a library.

void[] allocate(size_t size, size_t align)
{
    size_t allocSize = std.allocator.getSizeCompensatingForAlignment(size, align);

    void[] mem = ...; // allocation logic using allocSize

    return std.allocator.alignAllocation(mem, align); // adjusts the range, and maybe writes the offset to the prior bytes
}
Andrei Alexandrescu
2013-09-24 15:25:11 UTC
Permalink
Post by Manu
On 24 September 2013 15:31, Andrei Alexandrescu
<SeeWebsiteForEmail at erdani.org>
You can't go wasting GPU memory by overallocating every block.
Only the larger chunk may need to be overallocated if all
allocations are then rounded up.
I don't follow.
If I want to allocate 4k aligned, then 8k will be allocated (because it
wants to store an offset).
What do extant GPU allocators do here?
Post by Manu
Any smaller allocation let's say, 16 bytes, will round up to 4k. You
can't waste precious gpu ram like that.
That's easy, you just segregate allocations by size.
Post by Manu
A minimum and a maximum (guaranteed without over-allocating) alignment
may be useful.
What's the semantics of the minimum?
Post by Manu
But I think allocators need to be given the opportunity to do the best
it can.
It's definitely important that allocators are able to receive an
alignment request, and are given the opportunity to fulfill a dynamic
alignment request without always resorting to an over-allocation strategy.
I'd need a bit of convincing. I'm not sure everybody needs to pay
for the needs of a few, and it is quite possible that malloc_align suffers from
the same fragmentation issues as the next guy. Also, there's always
the possibility of leaving some bits to lower-level functions.
What are they paying exactly? An extra arg to allocate that can probably
be defaulted?
void[] allocate(size_t bytes, size_t align = this.alignment) shared;
For allocating relatively small objects (say up to 32K), we're looking
at tens of cycles, no more. An extra argument needs to be passed around
and more importantly looked at and acted upon. At this level it's a
serious dent in the time budget.

Part of the matter is that small objects must in a way be the fastest to
allocate. For larger objects, it is true to some extent that the caller
will do some work with the obtained memory, which offsets the relative
cost of allocation. (That being said, Jason Evans told me you can't
always assume the caller will do at least "a memset amount of work".)

Anyhow it stands to reason that you don't want to pay for matters
related to alignment without even looking.
Post by Manu
Or is it the burden of adding the overallocation boilerplate logic to
each allocator for simple allocators that don't want to deal with
alignment in a conservative way?
I imagine that could possibly be automated, the boilerplate could be
given as a library.
void[] allocate(size_t size, size_t align)
{
    size_t allocSize = std.allocator.getSizeCompensatingForAlignment(size, align);
    void[] mem = ...; // allocation logic using allocSize
    return std.allocator.alignAllocation(mem, align); // adjusts the range, and maybe writes the offset to the prior bytes
}
One possibility I'm thinking of is to make maximum alignment a static
property of the allocator. It may be set during runtime, but a given
allocator object has one defined maximum alignment. Would that be
satisfactory to all?


Andrei
Nick Sabalausky
2013-09-23 23:50:50 UTC
Permalink
On Sun, 22 Sep 2013 16:49:57 -0700
Post by Andrei Alexandrescu
* expand(b, minDelta, maxDelta) grows b's length by at least minDelta
(and on a best-effort basis by at least maxDelta)
Minor bikeshedding issue, but 'maxDelta' isn't a maximum at
all by your description, rather a suggested-but-optional minimum. So it
shouldn't be called 'maxDelta' - very misleading. Should be something
like 'preferredDelta' or 'idealMinDelta', etc.
Dmitry Olshansky
2013-09-24 08:46:20 UTC
Permalink
Post by Andrei Alexandrescu
Hello,
I am making good progress on the design of std.allocator, and I am
optimistic about the way it turns out. D's introspection capabilities
really shine through, and in places the design does feel really
archetypal - e.g. "this is the essence of a freelist allocator". It's a
very good feeling. The overall inspiration comes from Berger's
HeapLayers, but D's introspection takes that pattern to a whole new level.
Several details are still in flux, but at the top level it seems most
1. Typed allocator, i.e. every request for allocation comes with the
exact type requested;
2. Untyped allocator - traffics exclusively in ubyte[].
Looks good (s/ubyte[]/void[] per current discussion).

Do you imagine Typed allocators as something more than adapters that
simplify a common pattern of allocate + emplace / destroy + deallocate?
(create!T/destroy!T)
Post by Andrei Alexandrescu
struct NullAllocator
{
enum alignment = real.alignof;
enum size_t available = 0;
ubyte[] allocate(size_t s) shared { return null; }
bool expand(ref ubyte[] b, size_t minDelta, size_t maxDelta) shared
{ assert(b is null); return false; }
bool reallocate(ref ubyte[] b, size_t) shared
{ assert(b is null); return false; }
void deallocate(ubyte[] b) shared { assert(b is null); }
void collect() shared { }
void deallocateAll() shared { }
static shared NullAllocator it;
}
First things first - can't the allocator return alignment as a run-time
value - a property (just like 'available' does)? The implementation
contract is that it must be an O(1), vanilla, syscall-free function.
(Read this as: consult system info exactly once, at construction.)

Thinking more of it - it may also be an immutable size_t? Then it gets
its proper value at construction and is never changed thereafter.
Post by Andrei Alexandrescu
* expand(b, minDelta, maxDelta) grows b's length by at least minDelta
(and on a best-effort basis by at least maxDelta) and returns true, or
realloc() are most often not; such mini-epiphanies are very exciting
because they put the design on a beam guide with few principles and many
consequences.) The precondition is that b is null or has been previously
returned by allocate(). This method is optional.
Great to see this primitive. Classic malloc-ators are so lame...
(+ WinAPI Heaps fits here)
Post by Andrei Alexandrescu
* deallocate(b) deallocates the memory allocated for b. b must have been
previously allocated by the same allocator. This method is usually
unsafe (but there are notable implementations that may offer safety,
such as unbounded freelists.) This method is optional.
Does the following implication hold "have a deallocate" --> must be
manually managed? Otherwise how would one reliably work with it and not
leak? This brings us to traits that allocators may (should) have
a la: automatic? free-all on termination? zeros on allocate (more common
than one would think)? etc.
Post by Andrei Alexandrescu
* deallocateAll() deallocates in one shot all memory previously
allocated by this allocator. This method is optional, and when present
Region allocators are notable examples of allocators that define this
method.
Okay.. I presume region "mark" works by splitting off a subinstance of
allocator and "release" by deallocateAll().
Post by Andrei Alexandrescu
* collect() frees unused memory that had been allocated with this
allocator. This optional method is implemented by tracing collectors and
This seems hard and/or suboptimal w/o typeinfo --> typed allocators? I
guess they could be more than a simple helper.
Post by Andrei Alexandrescu
There are quite a few more things to define more precisely, but this
part of the design has become quite stable. To validate the design, I've
defined some simple allocators (Mallocator, GCAllocator, Freelist,
StackRegion, Region etc) but not the more involved ones, such as
coalescing heaps, buddy system etc.
The problem I see is that allocators are rooted in a poor foundation -
malloc is so out of fashion (its interface is simply too thin on
guarantees), sbrk is frankly a stone-age syscall.

I would love to make a root allocator one on top of mmap/VirtualAlloc.
This leads me to my biggest problem with classical memory management -
~20 years of PCs having virtual memory support directly in the CPU, and
yet hardly anything outside of OS takes advantage of it. They
(allocators) seem simply not aware of it - none of interface implies
anything that would help user take advantage of it. I'm talking of
(potentially) large allocations that may be often resized.

(large allocations actually almost always a blind spot/fallback in
allocators)

To the point - we have address space and optionally memory committed in
there, leading us to three states of an _address range_:
available, reserved, committed. The ranges of 2nd state are dirt cheap
(abundant on 64bit) and could be easily flipped to committed and back.

So I have a pattern I want to try to fit in your design. At present it
is a datastructure + allocation logic. What I would want is clean and
nice composition for ultimate reuse.

An example is a heavy-duty array (could be used for potentially large
stack). The requirements are:
a) Never reallocate on resize - in certain cases it may even be too
large to reallocate in RAM (!)
b) Never waste RAM beyond some limit (a few pages or a tiny fraction of
the size is fine)
c) O(1) amortized appending as in plain array and other properties of an
array -> it's a dynamic array after all

Currently I use mmap/madvise and related entities directly. It goes as
follows.

Allocate a sufficiently large - as large as "it" may get (1Mb, 1G, 10G
you name it) _address range_. Call this size a 'capacity'.
Commit some memory (optionally on first resize) - call this a
'committed'. Appending uses up committed space as a typical array
would. Whenever it needs more than that, it applies the same extending
algorithm as usual, but with the (committed, capacity) pair.
Ditto on resizing back - (with some amortization) it asks the OS to
decommit pages, decreasing the committed amount.

In short - a c++ "vector" that has 2 capacities (current and absolute
maximum) with a plus that it never reallocates. What to do if/when it
hits the upper bound - is up to specific application (terminate,
fallback to something else). It has the danger of sucking up address
space on 32bits though.. but who cares :)


Now how to fit this monster into the current API...
Plainly I need a way to tell hey Mr. Allocator, RESERVE me a 1Gb of RAM
but don't chew on it, please. In other words - allocate needs something
like a "guaranteed capacity of the block". By default it could be == size.

struct Allocator
{
...
// allocate size bytes within a block of no less than capacity bytes
void[] allocate(size_t size, size_t capacity=size);
...
}

Or maybe somehow add a reserve method explicitly...

Then extend then will easily do the second part of the job - in-place
resize (commit/decommit as required) and both hints make a lot of sense.

The pattern ideally would lower to: take a generic array, plug-in a
MemMap allocator, reserve top amount of memory at start ... profit!
--
Dmitry Olshansky
Andrei Alexandrescu
2013-09-24 15:56:59 UTC
Permalink
Post by Dmitry Olshansky
Looks good (s/ubyte[]/void[] per current discussion).
Changed.
Post by Dmitry Olshansky
Do you imagine Typed allocators as something more then adapters that
simplify a common pattern of allocate + emplace / destroy + deallocate?
(create!T/destroy!T)
Well as a simple matter tracing allocators will have to store dynamic
type information (just like the current GC does).

Also, I hope we'll be able to allow allocators to define Pointer(T),
Ref(T) etc. that serve as replacements for the built-in notions of
pointer, reference etc. Then, user code that uses these notions instead
of the built-in ones will be guaranteed some nice properties (e.g.
automatic reference counting). Unfortunately I don't see a way for an
allocator to enforce that its users don't do illegal things such as
escaping addresses etc. So it would be a discipline-backed scheme.
Notable beneficiaries will be containers.
Post by Dmitry Olshansky
First things first - can't the allocator return alignment as a run-time
value - a property (just like 'available' does)? The implementation
contract is that it must be an O(1), vanilla, syscall-free function.
(Read this as: consult system info exactly once, at construction.)
It could, but as I mentioned to Manu - at this level any cost is
significant. Even changing from a compile-time constant to a global
static dynamically-initialized constant has a cost. Making alignment an
instance variable turns stateless allocators into stateful ones.
Post by Dmitry Olshansky
Thinking more of it - it may also be an immutable size_t? Then it gets
its proper value at construction and is never changed thereafter.
I'm hoping to get away with a static function "static uint alignment()"
that (a) is CTFEable for most allocators/platforms, (b) uses a
dynamically-initialized static constant for a few. Then derived
allocators will inherit CTFEability.

Would a per-allocator-type alignment suffice?
Post by Dmitry Olshansky
Post by Andrei Alexandrescu
* deallocate(b) deallocates the memory allocated for b. b must have been
previously allocated by the same allocator. This method is usually
unsafe (but there are notable implementations that may offer safety,
such as unbounded freelists.) This method is optional.
Does the following implication hold "have a deallocate" --> must be
manually managed? Otherwise how would one reliably work with it and not
leak? This brings us to traits that allocators may (should) have
a la: automatic? free-all on termination? zeros on allocate (more common
than one would think)? etc.
The global GC does offer manual deallocation. It's the presence of
collect() that indicates tracing abilities. If deallocateAll() is
present, user code must assume it will be called during destruction.

That said, I've also been thinking of adding a variety of Boolean
telling properties. "Zeros on allocate" has an important relationship
with safety of higher-level objects - zeros offer "not really
initialized but without forged pointers so memory-safe" objects.
Post by Dmitry Olshansky
Post by Andrei Alexandrescu
* deallocateAll() deallocates in one shot all memory previously
allocated by this allocator. This method is optional, and when present
Region allocators are notable examples of allocators that define this
method.
Okay.. I presume region "mark" works by spiting off a subinstance of
allocator and "release" by deallocateAll().
Nice, didn't think of this.
Post by Dmitry Olshansky
Post by Andrei Alexandrescu
* collect() frees unused memory that had been allocated with this
allocator. This optional method is implemented by tracing collectors and
This seems hard and/or suboptimal w/o typeinfo --> typed allocators? I
guess they could be more than a simple helper.
Yah, collect() is not really appropriate at this level...
Post by Dmitry Olshansky
Post by Andrei Alexandrescu
There are quite a few more things to define more precisely, but this
part of the design has become quite stable. To validate the design, I've
defined some simple allocators (Mallocator, GCAllocator, Freelist,
StackRegion, Region etc) but not the more involved ones, such as
coalescing heaps, buddy system etc.
The problem I see is that allocators are rooted in a poor foundation -
malloc is so out of fashion (its interface is simply too thin on
guarantees), sbrk is frankly a stone-age syscall.
I would love to make a root allocator one on top of mmap/VirtualAlloc.
This leads me to my biggest problem with classical memory management -
~20 years of PCs having virtual memory support directly in the CPU, and
yet hardly anything outside of OS takes advantage of it. They
(allocators) seem simply not aware of it - none of interface implies
anything that would help user take advantage of it. I'm talking of
(potentially) large allocations that may be often resized.
(large allocations actually almost always a blind spot/fallback in
allocators)
To the point - we have address space and optionally memory committed in
available, reserved, committed. The ranges of 2nd state are dirt cheap
(abundant on 64bit) and could be easily flipped to committed and back.
So I have a pattern I want to try to fit in your design. At present it
is a datastructure + allocation logic. What I would want is clean and
nice composition for ultimate reuse.
[snip]

Interesting. I'll be looking forward to what you come up with. Yes, for
large, coarse-granularity allocations, tapping into OS' paging
capabilities is the best way to go.
Post by Dmitry Olshansky
Now how to fit this monster into the current API...
Plainly I need a way to tell hey Mr. Allocator RESERVE me a 1Gb of RAM
but don't chew on it, please. In other words - allocate needs something
like a "guaranteed capacity of the block". By default it could be == size.
struct Allocator
{
...
// allocate size bytes within a block of no less than capacity bytes
void[] allocate(size_t size, size_t capacity=size);
...
}
Or maybe somehow add a reserve method explicitly...
Assuming you have an optional two-argument allocate() primitive, once
you get the initial allocation, how do you commit it? By calling
expand() I assume? Would that be enough?

In the reserve()-based API - how would that work? Does reserve() return
a void[] with size zero that you then expand()?


Andrei
Dan Schatzberg
2013-09-24 11:38:28 UTC
Permalink
One thing I'm not sure is addressed by this design is memory
locality. I know of libnuma http://linux.die.net/man/3/numa which
allows me to express what NUMA domain my memory should be
allocated from at run-time for each allocation.

In the case that I want to allocate memory in a specific NUMA
domain (not just local vs non-local), I believe this design is
insufficient because the number of domains is only known at
run-time.

Also, as far as alignment is concerned I will throw in that x86
is relatively unique in having a statically known cache-line
size. Both ARM and PowerPC cores can differ in their cache-line
sizes. I feel this is a significant argument for the ability to
dynamically express alignment.
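The point about dynamic cache-line sizes can be made concrete on glibc, which exposes the L1 data cache line size as a runtime sysconf value (a small hedged C sketch; the fallback constant is just the common assumption mentioned in the thread, and _SC_LEVEL1_DCACHE_LINESIZE is a glibc extension, not portable POSIX):

```c
#include <stddef.h>
#include <unistd.h>

/* The L1 data cache line size is a runtime property of the machine;
 * glibc exposes it through sysconf. Fall back to a common default
 * when the system cannot report it. */
size_t cache_line_size(void)
{
#ifdef _SC_LEVEL1_DCACHE_LINESIZE
    long sz = sysconf(_SC_LEVEL1_DCACHE_LINESIZE);
    if (sz > 0) return (size_t)sz;
#endif
    return 64;   /* the usual x86 assumption, as noted above */
}
```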
deadalnix
2013-09-24 13:21:47 UTC
Permalink
On Tuesday, 24 September 2013 at 11:38:29 UTC, Dan Schatzberg
Post by Dan Schatzberg
One thing I'm not sure is addressed by this design is memory
locality. I know of libnuma http://linux.die.net/man/3/numa
which allows me to express what NUMA domain my memory should be
allocated from at run-time for each allocation.
Not being part of the interface does not mean that the allocator
cannot accept parameters - via state, for instance.
Dan Schatzberg
2013-09-24 13:52:02 UTC
Permalink
Post by deadalnix
On Tuesday, 24 September 2013 at 11:38:29 UTC, Dan Schatzberg
Post by Dan Schatzberg
One thing I'm not sure is addressed by this design is memory
locality. I know of libnuma http://linux.die.net/man/3/numa
which allows me to express what NUMA domain my memory should
be allocated from at run-time for each allocation.
Not being part of the interface does not mean that the allocator
cannot accept parameters - via state, for instance.
It is likely that I have a misunderstanding but as I understand
the proposal, I could not use the same allocator to allocate
memory with different locality domains. This seems like something
that would be desirable for the base allocator of a hierarchy.
Andrei Alexandrescu
2013-09-24 16:06:38 UTC
Permalink
One thing I'm not sure is addressed by this design is memory locality. I
know of libnuma http://linux.die.net/man/3/numa which allows me to
express what NUMA domain my memory should be allocated from at run-time
for each allocation.
In the case that I want to allocate memory in a specific NUMA domain
(not just local vs non-local), I believe this design is insufficient
because the number of domains is only known at run-time.
Also, as far as alignment is concerned I will throw in that x86 is
relatively unique in having a statically known cache-line size. Both ARM
and PowerPC cores can differ in their cache-line sizes. I feel this is a
significant argument for the ability to dynamically express alignment.
Could you send a few links so I can take a look?

My knee-jerk reaction to this is that NUMA allocators would provide
their own additional primitives and not participate naively in
compositions with other allocators.


Andrei