Discussion:
Adding Java and C++ to the MQTT benchmarks or: How I Learned to Stop Worrying and Love the Garbage Collector
Atila Neves
2014-01-08 11:35:19 UTC
Permalink
http://atilanevesoncode.wordpress.com/2014/01/08/adding-java-and-c-to-the-mqtt-benchmarks-or-how-i-learned-to-stop-worrying-and-love-the-garbage-collector/
Paulo Pinto
2014-01-08 12:32:17 UTC
Permalink
Post by Atila Neves
http://atilanevesoncode.wordpress.com/2014/01/08/adding-java-and-c-to-the-mqtt-benchmarks-or-how-i-learned-to-stop-worrying-and-love-the-garbage-collector/
Thanks for sharing your experience.

It goes with my experience moving enterprise server code from C++
to JVM/.NET land.

What people forget about C++ smart pointers vs
Objective-C/Rust/ParaSail ones is that without compiler support,
you simply spend too much time performing those reference-counting
operations by hand.

Over the holidays I spent some time researching the Mesa/Cedar
system developed at Xerox PARC. Cedar was already a GC-enabled,
strongly typed systems programming language.

It is quite remarkable what that system could do as a GUI desktop
workstation in the early '80s, and yet here we are in 2014 still
fighting to get GC-enabled systems programming languages accepted
in the mainstream.

--
Paulo
bearophile
2014-01-08 12:35:00 UTC
Permalink
Post by Atila Neves
http://atilanevesoncode.wordpress.com/2014/01/08/adding-java-and-c-to-the-mqtt-benchmarks-or-how-i-learned-to-stop-worrying-and-love-the-garbage-collector/
In this file:
https://github.com/atilaneves/mqtt/blob/master/mqttd/factory.d

Instead of code:

switch(fixedHeader.type) {
    case MqttType.CONNECT:
        return cereal.value!MqttConnect(fixedHeader);
    case MqttType.CONNACK:


Perhaps you want code as:

final switch(fixedHeader.type) with (MqttType) {
    case CONNECT:
        return cereal.value!MqttConnect(fixedHeader);
    case CONNACK:
    ...


Or even (modifying the enum):

final switch(fixedHeader.type) with (MqttType) {
    case connect:
        return cereal.value!MqttConnect(fixedHeader);
    case connack:
    ...


Bye,
bearophile
Atila Neves
2014-01-08 12:47:22 UTC
Permalink
Thanks. I didn't think of using with, possibly because I've never
used it before. It's one of those cool little features that I
liked when I read about it but never remember later.

I didn't use final switch on purpose; I normally would, but I
didn't implement all the possible MQTT message types. If I ever
do, it'll definitely be a final switch.
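For anyone who hasn't used it, a minimal sketch of the feature (the MsgType enum here is made up for illustration; it's not the mqtt code): a final switch over an enum may not have a default clause and must handle every member, so adding a member later breaks the build until every such switch is updated.

enum MsgType { connect, connack, publish }   // hypothetical enum, for illustration only

string describe(MsgType t) {
    string s;
    final switch (t) {                 // no default allowed; every member must appear
        case MsgType.connect: s = "CONNECT"; break;
        case MsgType.connack: s = "CONNACK"; break;
        case MsgType.publish: s = "PUBLISH"; break;
    }
    return s;
}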

Atila
bearophile
2014-01-08 18:23:59 UTC
Permalink
Post by Atila Neves
http://atilanevesoncode.wordpress.com/2014/01/08/adding-java-and-c-to-the-mqtt-benchmarks-or-how-i-learned-to-stop-worrying-and-love-the-garbage-collector/
Going to Reddit?

Bye,
bearophile
Atila Neves
2014-01-08 18:31:37 UTC
Permalink
I don't know if I have enough rep for it; I'd appreciate it if
someone who does could post it there.
Post by bearophile
Post by Atila Neves
http://atilanevesoncode.wordpress.com/2014/01/08/adding-java-and-c-to-the-mqtt-benchmarks-or-how-i-learned-to-stop-worrying-and-love-the-garbage-collector/
Going to Reddit?
Bye,
bearophile
Paulo Pinto
2014-01-08 18:59:46 UTC
Permalink
I don't know if I have enough rep for it, I'd appreciate it if someone
who does posts it there.
Post by bearophile
Post by Atila Neves
http://atilanevesoncode.wordpress.com/2014/01/08/adding-java-and-c-to-the-mqtt-benchmarks-or-how-i-learned-to-stop-worrying-and-love-the-garbage-collector/
Going to Reddit?
Bye,
bearophile
Done

http://www.reddit.com/r/programming/comments/1uqabe/adding_java_and_c_to_the_mqtt_benchmarks_or_how_i/

http://www.reddit.com/r/d_language/comments/1uqa4d/adding_java_and_c_to_the_mqtt_benchmarks_or_how_i/

--
Paulo
Atila Neves
2014-01-09 00:37:17 UTC
Permalink
Thanks. Not many votes, though, given all the downvotes. The
comments manage to be even worse than on my first blog post.

For some reason they all assume I don't know C++ even though I
know it way better than D, not to mention that they nearly all
miss the point altogether. Sigh.
Post by Paulo Pinto
I don't know if I have enough rep for it, I'd appreciate it if someone
who does posts it there.
Post by bearophile
Post by Atila Neves
http://atilanevesoncode.wordpress.com/2014/01/08/adding-java-and-c-to-the-mqtt-benchmarks-or-how-i-learned-to-stop-worrying-and-love-the-garbage-collector/
Going to Reddit?
Bye,
bearophile
Done
http://www.reddit.com/r/programming/comments/1uqabe/adding_java_and_c_to_the_mqtt_benchmarks_or_how_i/
http://www.reddit.com/r/d_language/comments/1uqa4d/adding_java_and_c_to_the_mqtt_benchmarks_or_how_i/
--
Paulo
Jesse Phillips
2014-01-09 15:37:09 UTC
Permalink
Post by Atila Neves
Thanks. Not many votes though given all the downvotes. The
comments manage to be even worse than on my first blog post.
For some reason they all assume I don't know C++ even though I
know it way better than D, not to mention that they nearly all
miss the point altogether. Sigh.
I wonder if someone who "knows" C++ is going to help you out and
improve your code, much like others did with the other languages
you used.
Atila Neves
2014-01-10 11:43:04 UTC
Permalink
Post by Jesse Phillips
Post by Atila Neves
Thanks. Not many votes though given all the downvotes. The
comments manage to be even worse than on my first blog post.
For some reason they all assume I don't know C++ even though I
know it way better than D, not to mention that they nearly all
miss the point altogether. Sigh.
I wonder if someone who "knows" C++ is going to help you out
and improve your code, much like others did with the other
languages you used.
I know C++. It's not that I can't finish it, it's that I can't be
bothered to. That's the whole point of the post.

Atila
Jesse Phillips
2014-01-10 16:21:11 UTC
Permalink
On Thursday, 9 January 2014 at 15:37:11 UTC, Jesse Phillips
Post by Jesse Phillips
Post by Atila Neves
Thanks. Not many votes though given all the downvotes. The
comments manage to be even worse than on my first blog post.
For some reason they all assume I don't know C++ even though
I know it way better than D, not to mention that they nearly
all miss the point altogether. Sigh.
I wonder if someone who "knows" C++ is going to help you out
and improve your code, much like others did with the other
languages you used.
I know C++. It's not that I can't finish it, it's that I can't
be
bothered to. That's the whole point of the post.
Atila
I know, that doesn't mean someone can't come in and fix what they
see wrong with it. C++ programmers have less reason to prove
their language, but I think most are in denial that their
language is difficult and that it is a problem.
Paulo Pinto
2014-01-10 16:52:06 UTC
Permalink
Post by Atila Neves
Post by Jesse Phillips
Thanks. Not many votes though given all the downvotes. The comments
manage to be even worse than on my first blog post.
For some reason they all assume I don't know C++ even though I know
it way better than D, not to mention that they nearly all miss the
point altogether. Sigh.
I wonder if someone who "knows" C++ is going to help you out and
improve your code, much like others did with the other languages you
used.
I know C++. It's not that I can't finish it, it's that I can't be
bothered to. That's the whole point of the post.
Atila
I know, that doesn't mean someone can't come in and fix what they see
wrong with it. C++ programmers have less reason to prove their language,
but I think most are in denial that their language is diffacult and that
it is a problem.
It does not help that C and C++ are currently the only portable
languages across mainstream OS vendors.

Currently I am using C++ for my Android hobby development, not because I
don't like Java, but because it is the only common language across all
mobile SDKs.

--
Paulo
Atila Neves
2014-01-10 18:52:47 UTC
Permalink
Post by Paulo Pinto
It does not help that C and C++ are currently the only portable
languages across mainstream OS vendors.
Currently I am using C++ for my Android hobby development, not
because I don't like Java, rather as it being the only common
language across all mobile SDKs.
I feel your pain. If I were to do a cross-platform app I'd
probably do the same. At least the Android NDK has newer gcc
versions that support C++11. I assume the same is true for iOS.

Atila
Jacob Carlborg
2014-01-10 21:06:11 UTC
Permalink
I feel your pain. If I were to do a cross-platform app I'd probably do
the same. At least the Android NDK has new gcc versions to use for
C++11. I assume the same is true for iOS.
Yeah, iOS uses LLVM so that means C++11 as well.
--
/Jacob Carlborg
Atila Neves
2014-01-10 18:51:10 UTC
Permalink
Post by Jesse Phillips
Post by Atila Neves
Post by Jesse Phillips
I wonder if someone who "knows" C++ is going to help you out
and improve your code, much like others did with the other
languages you used.
I know C++. It's not that I can't finish it, it's that I can't be
bothered to. That's the whole point of the post.
Atila
I know, that doesn't mean someone can't come in and fix what
they see wrong with it. C++ programmers have less reason to
prove their language, but I think most are in denial that their
language is diffacult and that it is a problem.
Ah right, I misunderstood what you meant. The denial is real
and I think the comments on reddit are proof of that. Who knows,
maybe I'll do it myself.

The weirdest part of it for me is that my (broken but working)
C++ implementation didn't even do badly performance-wise and
people still complained.

Atila
H. S. Teoh
2014-01-08 19:15:37 UTC
Permalink
Post by Atila Neves
http://atilanevesoncode.wordpress.com/2014/01/08/adding-java-and-c-to-the-mqtt-benchmarks-or-how-i-learned-to-stop-worrying-and-love-the-garbage-collector/
I have to say, this is also my experience with C++ after I learnt D.
Writing C++ is just so painful, so time-consuming, and so not rewarding
for the amount of effort you put into it, that I just can't bring myself
to write C++ anymore when I have the choice. And manual memory
management is a big part of that time sink. Which is why I believe that
a lot of the GC-phobia among the C/C++ folk is misplaced. I can
sympathise, though, because coming from a C/C++ background myself, I was
highly skeptical of GC'd languages, and didn't find it to be a
particularly appealing aspect of D when I first started learning it.

But as I learned D, I eventually got used to having the GC around, and
discovered that not only did it reduce the number of memory bugs
dramatically, it also increased my productivity dramatically: I never
realized just how much time and effort it took to write code with manual
memory management: you constantly have to think about how exactly you're
going to be storing your objects, who it's going to get passed to, how
to decide who's responsible for freeing it, what's the best strategy for
deciding who allocates and who frees. These considerations permeate
every aspect of your code, because you need to know whether to
pass/return an object* to someone, and whether this pointer implies
transfer of ownership or not, since that determines who's responsible to
free it, etc.. Even with C++'s smart pointers, you still have to decide
which one to use, and what pitfalls are associated with them (beware of
cycles with refcounted pointers, passing auto_ptr to somebody might
invalidate it after they return, etc.). It's like income tax: on just
about every line of code you write, you have to pay the "memory
management tax" of extra mental overhead and time spent fixing pointer
bugs in order to not get the IRS (Invalid Reference Segfault :P)
knocking on your shell prompt.

Manual memory management is a LOT of effort, and to be quite honest,
unless you're writing an AAA 3D game engine, you don't *need* that last
5% performance improvement that manual memory management *might* give
you. That is, if you get it right. Which most C/C++ coders don't.

Case in point: recently at work I had the dubious pleasure of
encountering some C code with a particularly pathological memory
mismanagement bug. To give a bit of context: in the past, this part of
the code used to be completely manually-managed with malloc's and free's
everywhere. Just like most C code that implements business logic, it
worked well when the original people who wrote it maintained it. But
life happens, and people leave and new people come, so over time, the
code degenerated into a sad mess riddled with memory leaks and pointer
bugs everywhere. So the team lead finally put his foot down, and
replaced much of that old code with a ref-counted infrastructure. (This
being C, installing a GC was too much work; plus, GC-phobia is pretty
strong in these parts.) After all, ref-counting is the silver bullet to
cure manual memory management troubles, right? Well...

Fast-forward a couple o' years, and here I am, helping a coworker figure
out why the code was crashing. Long story short, we eventually found
that it was keeping a ref-counted container that contains two (or more)
ref-counted objects, each of which represented an async task spawned by
the parent process. The idea behind this code was to run multiple
computations on the same data and use the results from whichever
finishes first; the remaining task(s) would simply be terminated. So
*somebody*, noting that we had a ref-counted system, decided to take
advantage of that fact by setting it up so that when a task finishes, it
will destroy the sub-object it's associated with, and the dtor of this
object (which will be automatically invoked by the ref-counting system)
will then walk the container and destruct every other object, which in
turn will terminate their associated tasks. Anybody spot the problem
yet? The reasoning (as far as I can reconstruct it, anyway), goes: "In
order for the dtor to destruct the remaining tasks, we just have to
decrement the refcount on the container object; since there should only
be 1 reference to it, this will cause it to dip to 0, and then the
container's dtor will take care of cleaning up all the other tasks. But
in order for the task, when it finishes, to trigger the dtor of its
associated sub-object, the refcount of the sub-object must be 1,
otherwise the dtor won't trigger and we'll get stuck. So either the
container's reference to the sub-object shouldn't be counted, or the
task's reference to the sub-object shouldn't be counted. ..." And it
just goes downhill from there.

So much for refcounting solving memory-management woes. I'm becoming
more and more convinced that most coders have no idea how to write
manual memory management code properly. Or ref-counted code, for that
matter. For all the time and effort it took to implement a ref-counting
system in *C*, no less, and the time and effort it took to fix all the
bugs associated with it, now somebody conveniently goes and subverts the
ref-counting system, and we wonder why the code isn't working? And this
isn't even performance-critical code; it's *business logic*, for crying
out loud. Sighh...

When I code in D, I discover to my pleasant surprise how much extra time
I have (and how much more spare mental capacity I have) now that I don't
have to continuously think about memory management. Sure, some of the
resulting code may not be squeezing every last drop of juice from my
CPU, but 95% of the time, it doesn't even matter anyway, 'cos it's not
even the performance bottleneck. One of the symptoms of C/C++ coders
(myself included) is that we like to write code in a funny, cramped
style that we've convinced ourselves is "optimal code". This includes
insistence on micro-managing memory allocations. However, most of this
is premature optimization, which can be readily proved by running a
profiler on your program, upon which you discover that *none* of your
meticulously-coded fine-tuned memory management code and carefully
written (aka unreadable and unmaintainable) loops is even anywhere *near*
the real performance bottleneck, which turns out to be a call to
printf() that you forgot to comment out. Or a strlen() whose necessity
was forced upon you because C/C++ is still suffering from that age-old
mistake of conflating arrays with pointers. (Honestly, the necessity of
using strlen() in inconvenient places easily overshadows 99% of the
meticulously-crafted optimizations you spent 40 hours to write.)

The amount of headache (and time better spent thinking about more
important things, like how to implement an O(n log n) algorithm in place
of the current O(n^2) algorithm that will singlehandedly make *all* of
your other premature optimizations moot) saved by having a GC is almost
priceless. Unless you're writing an AAA 3D game engine. Which only 5%
of us coders have the dubious pleasure of working on. :-P

Hooray for GC's, I say.


T
Andrei Alexandrescu
2014-01-08 19:34:50 UTC
Permalink
Post by Atila Neves
http://atilanevesoncode.wordpress.com/2014/01/08/adding-java-and-c-to-the-mqtt-benchmarks-or-how-i-learned-to-stop-worrying-and-love-the-garbage-collector/
http://www.reddit.com/r/programming/comments/1uqabe/adding_java_and_c_to_the_mqtt_benchmarks_or_how_i/?already_submitted=true

Andrei
Andrei Alexandrescu
2014-01-08 19:39:58 UTC
Permalink
Post by Atila Neves
http://atilanevesoncode.wordpress.com/2014/01/08/adding-java-and-c-to-the-mqtt-benchmarks-or-how-i-learned-to-stop-worrying-and-love-the-garbage-collector/
[snip]

You may want to paste all that as a reddit comment.

Andrei
Paulo Pinto
2014-01-08 20:22:57 UTC
Permalink
Post by Atila Neves
http://atilanevesoncode.wordpress.com/2014/01/08/adding-java-and-c-to-the-mqtt-benchmarks-or-how-i-learned-to-stop-worrying-and-love-the-garbage-collector/
[snip]

Thanks very much for sharing your experience.

As I shared a few times here, it was Oberon which opened my eyes
to GC enabled systems programming languages, around 1996, maybe.

After that I was curious to learn about the other descendants of
Oberon and Modula-3. Sadly none of them gained uptake outside ETHZ
and Olivetti, except maybe for Modula-3's influence on C#.

While researching my Oberon article, I discovered the Cedar
programming language, developed at Xerox PARC as part of their Mesa
system.

A strongly typed systems programming language with GC as well as
manual memory management, modules and functional programming
features, done in 1981.

My initial thought was: what would today's systems look like if Xerox
had had better connections to the outside world instead of AT&T?

--
Paulo
Joseph Rushton Wakeling
2014-01-09 23:02:39 UTC
Permalink
Post by Paulo Pinto
As I shared a few times here, it was Oberon which opened my eyes
to GC enabled systems programming languages, around 1996, maybe.
What was the GC design for Oberon, and how does that relate to what's in D (and
what's in other GC'd languages)?
Paulo Pinto
2014-01-10 09:24:14 UTC
Permalink
On Thursday, 9 January 2014 at 23:02:57 UTC, Joseph Rushton
Post by Joseph Rushton Wakeling
Post by Paulo Pinto
As I shared a few times here, it was Oberon which opened my
eyes
to GC enabled systems programming languages, around 1996,
maybe.
What was the GC design for Oberon, and how does that relate to
what's in D (and what's in other GC'd languages)?
The original Oberon used a simple mark-and-sweep collector,
initially implemented in assembly. In later versions it was coded
in Oberon itself.

Original 1992/2005 edition
http://www.inf.ethz.ch/personal/wirth/ProjectOberon1992.pdf

2013 edition with images of the workstations where Oberon ran
http://www.inf.ethz.ch/personal/wirth/ProjectOberon/PO.System.pdf

EthOS used a mark-and-sweep GC with support for weak pointers and
finalization, running when the system was idle or when not enough
memory was available.

http://research.microsoft.com/en-us/um/people/cszypers/books/insight-ethos.pdf

The Active Oberon implementation used a mark-and-sweep collector
with finalization support.

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.85.5753&rep=rep1&type=pdf

Modula-3 used a compacting GC initially, with an optional
background one.

https://modula3.elegosoft.com/cm3/doc/help/bib.html
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.36.6890

Cedar used a concurrent reference-counting collector, coupled
with a mark-and-sweep one for cycle removal, with finalization
support.

http://www.textfiles.com/bitsavers/pdf/xerox/parc/techReports/CSL-84-7_On_Adding_Garbage_Collection_and_Runtime_Types_to_a_Strongly-Typed_Statically-Checked_Concurrent_Language.pdf

The features are quite similar to D:

- GC
- Allocation of data structures statically in global memory and
stack
- Escape hatches to allocate memory manually when needed

I cannot say if they also allow for interior pointers like D does.

However the main point about Oberon and the other languages wasn't
only technical, but human. Funnily enough, that is also the subject
of Andrew Koenig's latest post:

http://www.drdobbs.com/cpp/social-processes-and-the-design-of-progr/240165221

The people designing such systems believed that it was possible
to write a workstation operating system from the ground up in a
GC-enabled systems programming language, with minimal assembly.

They did succeed and built workstations that were usable for
normal office work, which were then used at ETHZ, Xerox and
Olivetti for some time.

For games some more effort would be required, I do acknowledge
that.

However the world at large ignored these efforts. As Andrew
nicely puts it in his article, many times the social barrier is
higher than the technical one.

For many developers, merely hearing the words GC, safe coding, or
bounds checking is enough to make them run away as fast as they can.

--
Paulo
Benjamin Thaut
2014-01-08 20:23:48 UTC
Permalink
Post by H. S. Teoh
Manual memory management is a LOT of effort, and to be quite honest,
unless you're writing an AAA 3D game engine, you don't *need* that last
5% performance improvement that manual memory management *might* gives
you. That is, if you get it right. Which most C/C++ coders don't.
The problem is that with the current D GC it's not 5%. It's 300%. See:
http://3d.benjamin-thaut.de/?p=20
And people who are currently using C++ use C++ for a reason, and usually
that reason is performance. As long as D keeps its current GC,
people will refuse to switch, given the 300% speed impact.
Additionally, programming with a GC often leads to a lot more allocations,
with programmers being unaware of all those allocations and of the
possibility that they slow down the program and might even
trash the cache. Programmers who properly learned manual memory
management are often more aware of what's happening in the background and
of how to optimize algorithms for memory usage, which can lead to
astonishing performance improvements on modern hardware.

Also, a GC is for automatic memory management. But memory is just a
resource, and there are a lot of other resources besides memory. Having a
GC does not free you from managing those other resources manually, which
can still be annoying and can create the exact same issues as manual
memory management. Having a large C# codebase where almost
everything implements the IDisposable interface doesn't really improve
the situation. It would be a lot better if GCs focused on automatic
resource management in general, so the user is freed of all such tedious
tasks, and not just a portion of them.

Additionally, switching away from C++ is not an option for other
reasons as well, for example cross-platform compatibility. I don't know
any language other than C/C++ which would actually work on all
platforms we (my team at work) currently develop for. Not even D
(mostly because of missing ports of druntime/Phobos, maybe even a
missing hardware architecture).

But I fully agree that if you write non-performance-critical business
logic or application logic, it's a lot more productive to use a garbage
collected language. Unfortunately C# and Java are doing a far better job
than D here, mostly because of better tooling and more mature libraries.

Kind Regards
Benjamin Thaut
H. S. Teoh
2014-01-08 20:57:58 UTC
Permalink
Post by Benjamin Thaut
Post by H. S. Teoh
Manual memory management is a LOT of effort, and to be quite honest,
unless you're writing an AAA 3D game engine, you don't *need* that
last 5% performance improvement that manual memory management *might*
gives you. That is, if you get it right. Which most C/C++ coders
don't.
The problem is, that with the current D-GC its not 5%. Its 300%.
See: http://3d.benjamin-thaut.de/?p=20
Well, your experience was based on writing a 3D game engine. :) I didn't
claim that GCs are best for that scenario. How many of us write 3D game
engines for a living?
Post by Benjamin Thaut
And people who are currently using C++ use C++ for a reason. And
usually this reason is performance. As long as D remains with its
current GC people will refuse to switch, given the 300% speed
impact.
I think your view is skewed by your bad experience with doing 3D in D.
I've ported (well, more like re-written) compute-intensive code from
C/C++ to D before, and my experience has been that the D version is
either on par, or performs even better. Definitely nowhere near the 300%
slowdown you quote. (Not the mention the >50% reduction in development
time compared with writing it in C/C++!) Like I said, if you're doing
something that *needs* to squeeze every last bit of performance out of
the machine, then the GC may not be for you.

In fact, from what I hear, most people doing 3D engine work don't even
*use* memory allocation in the core engine -- everything is preallocated
so no allocation / free (not even malloc/free) is done at all. You never
know if a particular system's malloc/free relies on linear free lists,
which may cause O(n) worst-case performance -- something you definitely
want to avoid if you have only 20ms to render the next frame. If so,
then it's no wonder you see a 300% slowdown if you start using the GC
inside of the 3D engine.
Post by Benjamin Thaut
Additionaly programming with a GC often leads to a lot more
allocations, and programmers beeing unaware of all those allocations
and the possibility that those allocations slow down the program and
might even trash the cache. Programmers who properly learned manual
memory management are often more aware of whats happening in the
background and how to optmize algorithms for memory usage, which can
lead to astonishing performance improvements on modern hardware.
But the same programmers who don't know how to allocate properly in a
GC'd language will also write poorly-performing malloc/free code.
Freeing the root of a large tree structure can potentially run with no
fixed upper bound on time if the dtor recursively frees all child nodes,
so it's not that much better than a GC collection cycle. People who know
to avoid doing that will also know to write GC'd code in a way that
doesn't cause bad GC performance.
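To make that concrete, a small sketch (not from any real codebase) of a manually managed tree whose teardown is recursive; the single freeTree() call at the end does work proportional to the whole tree, which is exactly the kind of unbounded pause usually attributed only to collectors:

import core.stdc.stdlib : malloc, free;

struct Node {
    Node* left;
    Node* right;
}

// Freeing the root visits every node: O(number of nodes), no fixed upper bound.
void freeTree(Node* n) {
    if (n is null) return;
    freeTree(n.left);
    freeTree(n.right);
    free(n);
}

void main() {
    Node* root = null;
    foreach (i; 0 .. 10_000) {              // build a degenerate chain of nodes
        auto n = cast(Node*) malloc(Node.sizeof);
        n.left = root;
        n.right = null;
        root = n;
    }
    freeTree(root);   // one call from the caller's view, 10_000 frees underneath
}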
Post by Benjamin Thaut
Also a GC is for automatic memory management. But memory is just a
resource. And there are a lot other resources then just memory.
Having a GC does not free you from doing other manual memory
management, which still can be annoying and can create the exact
same issues as with manual memory management. Having a large C#
codebase where almost everything implementes the IDisposeable
interface doesn't really improve the situation. It would be a lot
better if GCs would focus on automatic resource management in
general, so the user is freed of all such tedious tasks, and not
just a portion of it.
True, but having a GC for memory is still better than having nothing at
all. Memory, after all, is the most commonly used resource, generically
speaking.
Post by Benjamin Thaut
Additionaly switching away from C++ is also not a option because of
other reasons. For example cross plattform compatibility. I don't
know any language other then C/C++ which would actually work on all
plattforms we (my team at work) currently develop for. Not even D
(mostly because of missing ports of druntime / phobos. Maybe even a
missing hardware architecture.)
That doesn't alleviate the painfulness of coding in C++.
Post by Benjamin Thaut
But I fully agree, that if you do some non performance critical
business logic or application logic its a lot more productive to use a
garbage collected language.
If you're doing performance-critical / realtime stuff, you probably want
to be very careful about how you use malloc/free anyway, same goes for
GC's.
Post by Benjamin Thaut
Unfortunately C# and Java are doing a far better job then D here,
mostly because of better tooling and more mature libraries.
[...]

I find the lack of strong metaprogramming capabilities in Java (never
tried C# before) a show-stopper for me. You have to resort to either
lots of duplicated code, or adding too many indirections that hurts
performance. For compute-intensive code, too many indirections can mean
the difference between something finishing in 2 days instead of 2 hours.


T
--
Computers are like a jungle: they have monitor lizards, rams, mice,
c-moss, binary trees... and bugs.

Benjamin Thaut
2014-01-08 22:23:50 UTC
Permalink
Post by H. S. Teoh
Well, your experience was based on writing a 3D game engine. :) I didn't
claim that GCs are best for that scenario. How many of us write 3D game
engines for a living?
No, this experience is not only based on that. I have observed multiple
discussions on the newsgroup where turning off the GC would speed up
the program by a factor of 3. The most recent one was parsing a text file and
filling an associative array with the contents of that text file, which
is not really 3D programming. What I'm really trying to say is: I would
be willing to use a GC in D too, but only if D actually has a
state-of-the-art GC and not some primitive old "works without language
support" GC.
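For reference, the usual workaround in those newsgroup benchmarks is to suspend collections around the hot loop; a minimal sketch (assuming the data comfortably fits in memory, with a made-up lineCounts function):

import core.memory : GC;
import std.stdio : File;

size_t[string] lineCounts(string path) {
    size_t[string] counts;
    GC.disable();               // no collections while the associative array grows
    scope (exit) GC.enable();   // re-enable afterwards; a collection may then run
    foreach (line; File(path).byLine()) {
        auto key = line.idup;   // byLine reuses its buffer, so copy the key
        if (auto p = key in counts) ++*p;
        else counts[key] = 1;
    }
    return counts;
}

Whether that recovers the factor of 3 obviously depends on the program.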
Post by H. S. Teoh
In fact, from what I hear, most people doing 3D engine work don't even
*use* memory allocation in the core engine -- everything is preallocated
so no allocation / free (not even malloc/free) is done at all. You never
know if a particular system's malloc/free relies on linear free lists,
which may cause O(n) worst-case performance -- something you definitely
want to avoid if you have only 20ms to render the next frame. If so,
then it's no wonder you see a 300% slowdown if you start using the GC
inside of the 3D engine.
That is a common misconception you can read very often on the internet.
That doesn't make it true, however. I have seen lots of game and engine
code in my life already, and it's far from preallocating everything.
Allocations are kept to a minimum, but they are not avoided at all costs.
If they're necessary they are just done (for example when spawning a new
object, like a particle effect). It is even common to use scripting
languages like Lua for some tasks in game development, and Lua allocates
quite a lot during execution.
Post by H. S. Teoh
But the same programmers who don't know how to allocate properly on a
GC'd language will also write poorly-performing malloc/free code.
Freeing the root of a large tree structure can potentially run with no
fixed upper bound on time if the dtor recursively frees all child nodes,
so it's not that much better than a GC collection cycle. People who know
to avoid doing that will also know to write GC'd code in a way that
doesn't cause bad GC performance.
That is another common argument from pro-GC people that I have never
seen hold in practice. Meaning, I have never seen a case where freeing
a tree of objects would cause a significant enough slowdown. I have,
however, seen lots of cases where a garbage collection caused a
significant slowdown.
Post by H. S. Teoh
True, but having a GC for memory is still better than having nothing at
all. Memory, after all, is the most commonly used resource, generically
speaking.
Still it only solves half the problem.
Post by H. S. Teoh
Post by Benjamin Thaut
Additionaly switching away from C++ is also not a option because of
other reasons. For example cross plattform compatibility. I don't
know any language other then C/C++ which would actually work on all
plattforms we (my team at work) currently develop for. Not even D
(mostly because of missing ports of druntime / phobos. Maybe even a
missing hardware architecture.)
That doesn't alleviate the painfulness of coding in C++.
It was never intended to. I just wanted to make the point, that even if
you want, you can't avoid C++.
Post by H. S. Teoh
Post by Benjamin Thaut
But I fully agree, that if you do some non performance critical
business logic or application logic its a lot more productive to use a
garbage collected language.
If you're doing performance-critical / realtime stuff, you probably want
to be very careful about how you use malloc/free anyway, same goes for
GC's.
This statement has again been posted hundreds of times in the GC vs
manual memory management discussion. And again, I have never seen the
execution time of malloc or other self-written allocators be a problem
in practice. I did however see the runtime of a GC allocation become
a problem, to the point where it is avoided entirely. With realtime I
didn't really mean the "hard" realtime requirements of embedded systems
and the like, more the "soft" realtime requirements where you want to
avoid pause times as much as possible.
Post by H. S. Teoh
I find the lack of strong metaprogramming capabilities in Java (never
tried C# before) a show-stopper for me. You have to resort to either
lots of duplicated code, or adding too many indirections that hurts
performance. For compute-intensive code, too many indirections can mean
the difference between something finishing in 2 days instead of 2 hours.
I fully agree here. Still, when choosing a programming language you also
have to pick one that all programmers on the team can and want to use. I
fear that D's metaprogramming capabilities will scare off quite a few
programmers because they seem too complicated to them. (It's really the
same with C++ metaprogramming: it's syntactically ugly and verbose, but
gets the job done, and is not so complicated if you are familiar with
the most important concepts.)
Joseph Rushton Wakeling
2014-01-08 22:43:26 UTC
Permalink
No, this expierence is not only based of this. I observed multiple discussions
on the newsgroup, where turning off the GC would speed up the program by factor
3.
In my experience it seems to depend very much on the particular problem being
solved and the circumstances in which memory is being allocated. Example: I
have some code where, at least in the source, dynamic arrays are being created
via "new" in a (fairly) inner loop, and this can be run repeatedly apparently
without the GC being triggered -- in fact, my suspicion is that the allocated
space is just being repeatedly re-used and overwritten, so there are no new
allocs or frees.

OTOH some other code I wrote recently had a situation where, as the data
structure in question expanded, a new array was allocated, and an old one copied
and then deallocated. This was fine up to a certain scale but above a certain
size the GC would (often but not always) kick in, leading to a significant (but
unpredictable) slowdown.

My impression was that below a certain level the GC is happy to either
over-allocate (leaving lots of space for expansion) and/or avoid freeing memory
(because there's plenty of memory still free), which avoids all the slowdown of
alloc/free until there's a significant need for it.
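A sketch of the usual mitigation for the second case, assuming the final size can be estimated up front (the numbers here are made up): reserving capacity so growth doesn't keep allocating a bigger block and copying the old one, which is precisely the pattern that gives the collector work to do.

double[] collectSamples(size_t expected) {
    double[] samples;
    samples.reserve(expected);     // one allocation up front instead of repeated grow-and-copy
    foreach (i; 0 .. expected)
        samples ~= i * 0.5;        // appends reuse the reserved block, no reallocation
    return samples;
}

std.array.Appender does much the same job when the element count isn't known in advance.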
H. S. Teoh
2014-01-08 22:50:05 UTC
Permalink
Post by Joseph Rushton Wakeling
Post by Benjamin Thaut
No, this expierence is not only based of this. I observed multiple
discussions on the newsgroup, where turning off the GC would speed up
the program by factor 3.
In my experience it seems to depend very much on the particular
problem being solved and the circumstances in which memory is being
allocated. Example: I have some code where, at least in the source,
dynamic arrays are being created via "new" in a (fairly) inner loop,
and this can be run repeatedly apparently without the GC being
triggered -- in fact, my suspicion is that the allocated space is
just being repeatedly re-used and overwritten, so there are no new
allocs or frees.
OTOH some other code I wrote recently had a situation where, as the
data structure in question expanded, a new array was allocated, and
an old one copied and then deallocated. This was fine up to a
certain scale but above a certain size the GC would (often but not
always) kick in, leading to a significant (but unpredictable)
slowdown.
My impression was that below a certain level the GC is happy to
either over-allocate (leaving lots of space for expansion) and/or
avoid freeing memory (because there's plenty of memory still free),
which avoids all the slowdown of alloc/free until there's a
significant need for it.
So this proves that the real situation with GC vs manual memory
management isn't as simple as a binary "GC is better" or "GC is bad". It
depends a lot on the exact use case.

And now that you mention it, there does seem to be some kind of
threshold where something happens (I wasn't sure what it was before, but
now I'm thinking maybe it's a change in GC behaviour) where there's a
sudden change in program performance, that I've observed recently in one
of my programs. I might have a look into it sometime -- though I was
planning to redo that part of the code anyway, so I may or may not find
out the real reason behind this.


T
H. S. Teoh
2014-01-08 23:01:29 UTC
Permalink
[...]
Post by Benjamin Thaut
Post by H. S. Teoh
I find the lack of strong metaprogramming capabilities in Java (never
tried C# before) a show-stopper for me. You have to resort to either
lots of duplicated code, or adding too many indirections that hurts
performance. For compute-intensive code, too many indirections can
mean the difference between something finishing in 2 days instead of
2 hours.
I fully agree here. Still when choosing a programming language you
also have to pick one that all programmers on the team can and want to
use. I fear that the D metaprogramming capabilities will scare of
quite a few programmers because it seems to complicated to them. (Its
really the same with C++ metaprogramming. Its syntactically ugly and
verbose, but gets the job done, and is not so complicated if you are
familiar with the most important concepts).
Coming from a C++ background, I have to say that C++ metaprogramming,
while possible, is only so in the most painful possible ways. My
impression is that C++ gave template metaprogramming a bad name, because
much of the metaprogramming aspects of templates were only discovered
after the fact, so the original design was never intended to be used in
the way it's used nowadays. As a result, people associate the design
flaws in C++ templates with template programming and metaprogramming in
general, whereas such flaws aren't an inherent feature of
metaprogramming itself.

Unfortunately, this makes people go "ewww" when they hear about D's
metaprogramming, whereas the real situation is that metaprogramming is
actually a pleasant experience in D, and very powerful if you know how
to take advantage of it.

One thing I really liked about TDPL is that Andrei sneakily introduces
metaprogramming as "compile-time parameters" early on, so that by the
time you get to the actual chapter on templates, you've already been
using them comfortably for a long time, and no longer have an irrational
fear of them.
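A tiny example of what that looks like (nothing beyond the feature itself): the first parameter is known at compile time, so poly is really a template, but the call site reads like an ordinary function call with one extra argument.

import std.stdio : writeln;

// 'n' is a compile-time parameter: poly!3 is its own instantiation,
// and the loop bound is known while the function is being compiled.
double poly(int n)(double x) {
    double r = 1.0;
    foreach (i; 0 .. n)
        r = r * x + 1.0;
    return r;
}

void main() {
    writeln(poly!3(2.0));   // prints 15
}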


T
--
Without geometry, life would be pointless. -- VS
Atila Neves
2014-01-09 00:46:48 UTC
Permalink
Post by Benjamin Thaut
No, this expierence is not only based of this. I observed
multiple discussions on the newsgroup, where turning off the GC
would speed up the program by factor 3. The most recent one was
The GC doesn't even show up in the profiler for this/my use case.
The one optimisation I did to avoid allocations increased
performance by all of 5%. It really depends on the use case, and
I don't think assuming a factor of 3 is advisable.
Post by Benjamin Thaut
That is another common argument of pro GC people I have never
seen in partice yet. Meaning, I never seen a case where freeing
a tree of objects would cause a significant enough slowdown. I
however saw lots of cases where a garbage collection caused a
significant slowdown.
Well, if I wasn't aware of allocation I wouldn't have done the
optimisation mentioned above, so it's a good point.

As far as slowdown happening with manual memory management, in
certain cases cleaning up reference counted smart pointers can
cause as much of a slowdown as a GC kicking in. This isn't my
opinion though, there are data to that effect. Again, it depends
on the use case.
Post by Benjamin Thaut
Still it only solves half the problem.
Maybe in Java. In D at least we have struct destructors for other
resources.
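A minimal sketch of that (illustrative only, the file name is made up): a struct destructor closes a C file handle deterministically at scope exit, without waiting for the GC.

import core.stdc.stdio : FILE, fclose, fopen;

struct UniqueFile {
    FILE* fp;
    this(const(char)* path, const(char)* mode) { fp = fopen(path, mode); }
    ~this() { if (fp !is null) { fclose(fp); fp = null; } }
    @disable this(this);   // no copies, so the handle is closed exactly once
}

void main() {
    auto f = UniqueFile("data.txt", "r");
    // ... use f.fp ...
}   // f's destructor runs here, closing the file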
Post by Benjamin Thaut
It was never intended to. I just wanted to make the point, that
even if you want, you can't avoid C++.
A fair point. I think what we're saying is not that we won't ever
write C++ again, but that we won't write it again if given the
choice and if another language (not necessarily D) is also a good
fit.

I'd be surprised if I wasn't still writing / refactoring /
debugging C++ code a few decades from now. I don't want to write C
again ever, but I know I'll have to.
Post by Benjamin Thaut
I fully agree here. Still when choosing a programming language
you also have to pick one that all programmers on the team can
and want to use. I fear that the D metaprogramming capabilities
will scare of quite a few programmers because it seems to
complicated to them. (Its really the same with C++
metaprogramming. Its syntactically ugly and verbose, but gets
the job done, and is not so complicated if you are familiar
with the most important concepts).
I disagree wholeheartedly. It's a _lot_ more complicated in C++.
D can also do more than C++, with far saner syntax.
Walter Bright
2014-01-09 03:08:03 UTC
Permalink
Post by Benjamin Thaut
Additionaly programming with a GC often leads to a lot more allocations,
I believe that this is incorrect. Using GC leads to fewer allocations, because
you do not have to make extra copies just so it's clear who owns the allocations.

For example, if you've got an array of char* pointers, in D some can be GC
allocated, some can be malloc'd, some can be slices, some can be pointers to
string literals. In C/C++, the array has to decide on an ownership policy, and
all elements must conform.

This means extra copies.
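Concretely, a small sketch of that point with made-up strings: a D string[] can mix literals, GC-allocated strings and slices of other strings, because no element needs to own anything (a malloc'd buffer cast to immutable could sit in the same array just as well).

import std.conv : to;

void main() {
    string literal = "hello, world";             // lives in the program's read-only data
    string gcOwned = 42.to!string;               // freshly GC-allocated
    string slice   = literal[7 .. 12];           // "world": a view into another string, owns nothing
    string[] all   = [literal, gcOwned, slice];  // no ownership policy, no copies of the characters
}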
Manu
2014-01-09 06:11:38 UTC
Permalink
Post by Walter Bright
Post by Benjamin Thaut
Additionaly programming with a GC often leads to a lot more allocations,
I believe that this is incorrect. Using GC leads to fewer allocations,
because you do not have to make extra copies just so it's clear who owns
the allocations.
You're making a keen assumption here that C programmers use STL. And no
sane programmer that I've ever worked with uses STL precisely for this
reason :P
Sadly, being conscious of eliminating unnecessary copies in C/C++ takes a
lot of work (see: time and money), so there is definitely value in
factoring that problem away, but the existing GC is broken. Until it
doesn't leak, stop the world, and/or can run incrementally, it remains no
good for realtime usage.
There were 2 presentations on improved GC's last year, why do we still have
the lamest GC imaginable? I'm still yet to hear any proposal on how this
situation will ever significantly improve...

*cough* ARC...

Post by Walter Bright
For example, if you've got an array of char* pointers, in D some can be GC
allocated, some can be malloc'd, some can be slices, some can be pointers
to string literals. In C/C++, the array has to decide on an ownership
policy, and all elements must conform.
This means extra copies.
Klaim - Joël Lamotte
2014-01-09 15:01:26 UTC
Permalink
Post by Manu
You're making a keen assumption here that C programmers use STL. And no
sane programmer that I've ever worked with uses STL precisely for this
reason :P
I think this sentence is misleading. I've made high-performance applications
with no copying using the STL. Your "sane programmers" are just people who
don't want to learn it.
Sane programmers make sure they know the strengths and pitfalls of their
tools. They don't avoid tools because of incorrect assumptions, like
you are making here.
Also, this has nothing to do with the STL.
Paulo Pinto
2014-01-09 06:51:41 UTC
Permalink
On 9 January 2014 13:08, Walter Bright
Post by Walter Bright
Post by Benjamin Thaut
Additionaly programming with a GC often leads to a lot more allocations,
I believe that this is incorrect. Using GC leads to fewer allocations,
because you do not have to make extra copies just so it's clear who owns
the allocations.
You're making a keen assumption here that C programmers use STL. And no
sane programmer that I've ever worked with uses STL precisely for this
reason :P
Sadly, being conscious of eliminating unnecessary copies in C/C++ takes a
lot of work (see: time and money), so there is definitely value in
factoring that problem away, but the existing GC is broken. Until it
doesn't leak, stop the world, and/or can run incrementally, it remains no
good for realtime usage.
There were 2 presentations on improved GC's last year, why do we still have
the lamest GC imaginable? I'm still yet to hear any proposal on how this
situation will ever significantly improve...
*cough* ARC...
For it to be done properly, RC needs to be compiler assisted,
otherwise it is just too slow.

--
Paulo
Walter Bright
2014-01-09 07:07:28 UTC
Permalink
On 9 January 2014 13:08, Walter Bright <newshound2 at digitalmars.com
Additionaly programming with a GC often leads to a lot more allocations,
I believe that this is incorrect. Using GC leads to fewer allocations,
because you do not have to make extra copies just so it's clear who owns the
allocations.
You're making a keen assumption here that C programmers use STL.
My observation has nothing to do with the STL, nor does it have anything to do
with how well the GC is implemented. Also, neither smart pointers nor ARC
resolve the excessive copying problem as I described it.

I coded in C for 15-20 years before the STL existed, and the problem of
excessive copying is a significant source of slowdown for C code.

Consider this C code:

char* cat(char* s1, char* s2) {
    size_t len1 = s1 ? strlen(s1) : 0;
    size_t len2 = s2 ? strlen(s2) : 0;
    char* s = (char*)malloc(len1 + len2 + 1);
    assert(s);
    memcpy(s, s1, len1);
    memcpy(s + len1, s2, len2);
    s[len1 + len2] = 0;
    return s;
}

Now consider D code:

string cat(string s1, string s2) {
    return s1 ~ s2;
}

I can call cat with:

cat("hello", null);

and it works without copying in D, it just returns s1. In C, I gotta copy, ALWAYS.

(C's strings being 0 terminated also forces much extra copying, but that's
another topic.)

The point is, no matter how slow the GC is relative to malloc, not allocating is
faster than allocating, and a GC can greatly reduce the amount of alloc/copy
going on.

The reason that Java does excessive amounts of allocation is because Java
doesn't have value types, not because Java has a GC.
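The difference in allocation pressure is easy to see in a sketch (not a benchmark): an array of struct values is a single allocation, while an array of class references needs one extra heap object per element.

struct PointV { double x, y; }   // value type: stored inline in the array
class  PointC { double x, y; }   // reference type: each instance is its own heap object

void main() {
    auto vals = new PointV[](1000);   // one allocation holds all 1000 points
    auto refs = new PointC[](1000);   // one allocation for the references...
    foreach (ref r; refs)
        r = new PointC();             // ...plus 1000 small ones, one per object
}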
Paulo Pinto
2014-01-09 08:38:22 UTC
Permalink
Post by Walter Bright
On 9 January 2014 13:08, Walter Bright
<newshound2 at digitalmars.com
The reason that Java does excessive amounts of allocation is
because Java doesn't have value types, not because Java has a
GC.
That might change if IBM's extensions ever land in Java.

http://www.slideshare.net/rsciampacone/javaone-2013-introduction-to-packedobjects

Video presentation available here,
http://www.parleys.com/play/52504e5ee4b0a43ac121240b

Walter is right regarding D. All other GC-enabled systems
programming languages do have value objects and don't require
everything to be on the heap.

So the stress on the GC to clean memory is not as high as on Java
and similar systems.

--
Paulo


Ola Fosheim Grøstad
2014-01-09 08:40:29 UTC
Permalink
Post by Walter Bright
and it works without copying in D, it just returns s1. In C, I
gotta copy, ALWAYS.
Only if you write libraries, in an application you can set your
own policies (invariants).
Post by Walter Bright
(C's strings being 0 terminated also forces much extra copying,
but that's another topic.)
Not if you have your own allocator and split chopped strings (you
can just overwrite the boundary character).
Post by Walter Bright
The point is, no matter how slow the GC is relative to malloc,
not allocating is faster than allocating, and a GC can greatly
reduce the amount of alloc/copy going on.
But since malloc/free is tedious, C programmers tend to avoid it
by embedding objects in large structs and putting a variable-sized
object at the end... Or by having their own pool (possibly on
the stack at the location where it should be released).
Paulo Pinto
2014-01-09 09:10:06 UTC
Permalink
On Thursday, 9 January 2014 at 08:40:30 UTC, Ola Fosheim Grøstad
On Thursday, 9 January 2014 at 07:07:29 UTC, Walter Bright
Post by Walter Bright
and it works without copying in D, it just returns s1. In C, I gotta copy, ALWAYS.
Only if you write libraries, in an application you can set your
own policies (invariants).
Post by Walter Bright
(C's strings being 0 terminated also forces much extra
copying, but that's another topic.)
Not if you have your own allocator and split chopped strings
(you can just overwrite the boundary character).
Post by Walter Bright
The point is, no matter how slow the GC is relative to malloc,
not allocating is faster than allocating, and a GC can greatly
reduce the amount of alloc/copy going on.
But since malloc/free is tedious c-programmers tend to avoid it
by embedding objects in large structs and put a variable sized
object at the end of it... Or have their own pool (possibly on
the stack at the location where it should be released).
I have only seen those things work in small AAA class teams.
Paulo Pinto
2014-01-09 09:49:15 UTC
Permalink
On Thursday, 9 January 2014 at 09:38:31 UTC, Ola Fosheim Grøstad
Post by Paulo Pinto
I have only seen those things work in small AAA class teams.
But you have probably seen c programs allocate a bunch of
different small structs with a single malloc where it is known
that they will be freed in the same location? A compiler needs
whole program analysis to do the same.
So yes, c programs will have fewer allocs if the programmer
cared.
Yes, I did.

Not much different from memory pools in Turbo Pascal and
Objective-C, for that matter.

And even stranger things, where the whole memory gets
allocated at startup, then some "handles" are used with mysterious
macros to convert back and forth to real pointers.

I have also seen lots of other storage tricks that easily get out
of control when the team either grows over a certain size, or
management decides to outsource part of the development or to
lower the expected skill set of new team members.

Then you watch the older guys playing fire brigade to track down
issues of release X.Y.Z at a customer site, almost every week.


--
Paulo
H. S. Teoh
2014-01-09 16:19:53 UTC
Permalink
Post by Paulo Pinto
On Thursday, 9 January 2014 at 09:38:31 UTC, Ola Fosheim Grøstad
Post by Paulo Pinto
I have only seen those things work in small AAA class teams.
But you have probably seen c programs allocate a bunch of
different small structs with a single malloc where it is known
that they will be freed in the same location? A compiler needs
whole program analysis to do the same.
So yes, c programs will have fewer allocs if the programmer cared.
Yes, I did.
Not much different than memory pools in Turbo Pascal and Objective-C
for that matter.
And even more strange things, where the whole memory gets allocated
at start, then some "handles" are used with mysterious macros to
convert back and forth to real pointers.
I have also seen lots of other storage tricks that go easily out of
control when the team either grows over a certain size, or
management decides to outsource part of the development or lowering
the expected skill set of new team members.
Then you watch the older guys playing fire brigade to track down
issues of release X.Y.Z at customer site, almost every week.
[...]

Exactly!! All these tricks are "possible" in C, but that's what they
essentially are: tricks, hacks around the language. You can only keep it
up with a small, dedicated core team. As soon as the PTBs decide to hire
new grads and move people around, you're screwed, 'cos the old guy who
was in charge of the tricky macros is no longer on the team, and nobody
else understands how the macros work, and the new guys are under
pressure to show contribution, so they barge in making assumptions about
how things work -- which usually means naïve C semantics, lots of
strcpy's, direct pointer arithmetic, I don't use these weird macros 'cos
I don't understand what they do. Result: fire brigade. :-)

This is why compiler-enforced type attributes ultimately trump any kind
of coding convention. It forces everyone to do the Right Thing. This is
why strings (arrays) with built-in length is better, because it allows
slicing without needing to decide whether you should copy or modify
in-place (*someone* will inevitably get it wrong).

C's superiority is keyed on the programmer being perfect -- the
philosophy of the language is to trust the programmer, to believe that
the programmer knows what he's doing. Theoretically speaking, this is a
good thing, because the compiler won't stand in your way and annoy you
when you're trying to do something clever. (This is also what made me
like C in the first place -- I was 19 at the time, so it figures. :-P)
Unfortunately, in practice, humans are fallible -- very much fallible
and error-prone -- so this philosophy only leads to pain and more pain.
With a single-person project you can still somewhat maintain some
semblance of order. But when you have a team of 15+ programmers (at my
job we have up to 50), then it's total chaos, and you start to code by
paranoia, i.e., assume everyone else will screw up and add every
possible safeguard you can think of in your part of the code, so that
when things go wrong it's not your fault. Which means every string
modification requires copying, which means performance is out the
window. It means adding layers of indirection to shield your code from
the outside world. Which means even more pointers to work with, which in
turn means you start getting into pointer management problems, and start
needing reference counting (which, as I described in an earlier post,
people *still* screw up). At some point, you start wishing C had a GC to
clean up the mess.


T
--
Public parking: euphemism for paid parking. -- Flora
Walter Bright
2014-01-09 09:55:42 UTC
Permalink
On 1/9/2014 1:38 AM, "Ola Fosheim Grøstad"
Post by Paulo Pinto
I have only seen those things work in small AAA class teams.
But you have probably seen c programs allocate a bunch of different small
structs with a single malloc where it is known that they will be freed in the
same location? A compiler needs whole program analysis to do the same.
So yes, c programs will have fewer allocs if the programmer cared.
A GC does not prevent such techniques.
Walter Bright
2014-01-09 17:17:53 UTC
Permalink
On 1/9/2014 3:40 AM, "Ola Fosheim Grøstad"
Post by Walter Bright
A GC does not prevent such techniques.
No, but programmers gravitate towards less work... If alloc is transparent and
free is hidden... You gain a lot from not being explicit, but you get more
allocations overall.
GC doesn't even make those techniques harder.

I can't see any merit to the idea that GC makes for excessive allocation.
Walter Bright
2014-01-09 09:58:23 UTC
Permalink
On 1/9/2014 12:40 AM, "Ola Fosheim Grøstad"
Post by Walter Bright
and it works without copying in D, it just returns s1. In C, I gotta copy, ALWAYS.
Only if you write libraries, in an application you can set your own policies
(invariants).
Please explain how this can work passing both string literals and allocated
strings to cat().
Post by Walter Bright
(C's strings being 0 terminated also forces much extra copying, but that's
another topic.)
Not if you have your own allocator and split chopped strings (you can just
overwrite the boundary character).
How do you return a string that is the path part of a path/filename? (The
terminating 0 is not a problem solved by creating your own allocator.)
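
In D the path part is just a slice of the original -- roughly (untested
sketch; Phobos has std.path.dirName for the real thing):

    import std.string : lastIndexOf;

    // The path part is a view into the argument: no copy is made and no
    // terminating 0 has to be patched in anywhere.
    string pathPart(string fullName)
    {
        auto i = fullName.lastIndexOf('/');
        return i < 0 ? "" : fullName[0 .. i];
    }

    unittest
    {
        assert(pathPart("/tmp/foo.txt") == "/tmp");
        assert(pathPart("foo.txt") == "");
    }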
Walter Bright
2014-01-09 17:15:46 UTC
Permalink
On 1/9/2014 2:46 AM, "Ola Fosheim Grøstad"
Post by Walter Bright
Please explain how this can work passing both string literals and allocated
strings to cat().
By having your own string allocator that tests for membership when you free (if
you allow free and foreign strings in your cat)?
How does that work when you pass it "hello", or something allocated with
malloc() -- basically any data that has mixed ancestry?

Note that your code doesn't always have control over this - you may have written
a library intended to be used by others, or you may be calling a library written
by others.
Post by Walter Bright
How do you return a string that is the path part of a path/filename? (The
terminating 0 is not a problem solved by creating your own allocator.)
If you discard the original you split at '/'.
That doesn't work if you pass a string literal, or if you are not the owner of
the data.
If you use your own
string allocator you don't have to worry about free... You either let the garbage
remain until the pool is released or have a separate allocation structure that
allows internal splits (no private size info before first char).
That doesn't work if you're passing strings with mixed ancestry.
Walter Bright
2014-01-09 18:34:57 UTC
Permalink
On 1/9/2014 10:18 AM, "Ola Fosheim Grøstad"
Post by Walter Bright
How does that work when you pass it "hello", or something allocated with
malloc() -- basically any data that has mixed ancestry?
Why would you do that? You would have to overload cat then.
So you agree that it won't work.

BTW, it happens all the time when dealing with strings. For example, dealing
with filenames, file extensions, and paths. Components can come from the command
line, string literals, malloc, slices, etc., all mixed up together.

Overloading doesn't work because a string literal and a string allocated by
something else have the same type.
Post by Walter Bright
That doesn't work if you're passing strings with mixed ancestry.
Well, you have to decide if you want to roll your own, use a framework or use
the old C way.
The point is more: you can make your own and make it C-compatible, and
reasonably efficient.
My point is you can't avoid making the extra copies without GC in any reasonable
way.
Paulo Pinto
2014-01-09 19:16:12 UTC
Permalink
Post by Walter Bright
On 1/9/2014 10:18 AM, "Ola Fosheim Grøstad"
Post by Walter Bright
How does that work when you pass it "hello", or something allocated with
malloc() -- basically any data that has mixed ancestry?
Why would you do that? You would have to overload cat then.
So you agree that it won't work.
BTW, it happens all the time when dealing with strings. For example,
dealing with filenames, file extensions, and paths. Components can come
from the command line, string literals, malloc, slices, etc., all mixed
up together.
Overloading doesn't work because a string literal and a string allocated
by something else have the same type.
Post by Walter Bright
That doesn't work if you're passing strings with mixed ancestry.
Well, you have to decide if you want to roll your own, use a framework or use
the old C way.
The point is more: you can make your own and make it C-compatible, and
reasonably efficient.
My point is you can't avoid making the extra copies without GC in any
reasonable way.
Every time I see such discussions, it reminds me of when I started coding
in the mid-80s and of the heresy of using languages like Pascal and C
dialects for microcomputers, instead of coding everything in Assembly or
Forth.

:)

--
Paulo
H. S. Teoh
2014-01-09 19:28:56 UTC
Permalink
Post by Paulo Pinto
Post by Walter Bright
On 1/9/2014 10:18 AM, "Ola Fosheim Grøstad"
Post by Walter Bright
How does that work when you pass it "hello", or something allocated
with malloc() -- basically any data that has mixed ancestry?
Why would you do that? You would have to overload cat then.
So you agree that it won't work.
BTW, it happens all the time when dealing with strings. For example,
dealing with filenames, file extensions, and paths. Components can
come from the command line, string literals, malloc, slices, etc.,
all mixed up together.
Overloading doesn't work because a string literal and a string
allocated by something else have the same type.
Post by Walter Bright
That doesn't work if you're passing strings with mixed ancestry.
Well, you have to decide if you want to roll your own, use a
framework or use the old C way.
The point is more: you can make your own and make it C-compatible,
and reasonably efficient.
My point is you can't avoid making the extra copies without GC in any
reasonable way.
Every time I see such discussions, it reminds me of when I started
coding in the mid-80s and of the heresy of using languages like Pascal
and C dialects for microcomputers, instead of coding everything in
Assembly or Forth.
:)
[...]

Ah, the good ole 80's. I remember I was strongly pro-assembly in those
days. Back then compiler / interpreter technology was still rather
young, and the little that I saw of it didn't leave a good impression,
so I regarded all high-level languages with suspicion. :) Especially
languages that sport "nice" string operators, since back then many
language implementations had rather naïve string implementations, which
were really slow and inefficient.


T
--
Always remember that you are unique. Just like everybody else. -- despair.com
deadalnix
2014-01-09 22:15:17 UTC
Permalink
On Thursday, 9 January 2014 at 22:02:48 UTC, Ola Fosheim Grøstad
What I really like about D is that the front end code appears
to be quite readable. Take a look at clang and you will see the
difference. So, I guess anyone with C++ knowledge has the
opportunity to tune both syntax and semantics to their own
liking and share it with others. That's pretty sweet (I'd like
to try that one day).
This definitively convinced me that you must be very high on
drugs.

Ola Fosheim Grøstad
2014-01-09 22:22:59 UTC
Permalink
This definitively convinced me that you must be very high on
drugs.
Why is that? I have browsed the repositories and had no problems
figuring out what was going on from what I read. I don't
understand all the interdependencies of course, but making small
changes should not be a big deal from what I've seen.
Paulo Pinto
2014-01-10 09:32:15 UTC
Permalink
On Thursday, 9 January 2014 at 22:02:48 UTC, Ola Fosheim Grøstad
Post by Paulo Pinto
Every time I see such discussions, it reminds me when I
started coding in the mid-80s and the heresy of using
languages like Pascal and C dialects for microcomputers,
instead of coding everything in Assembly or Forth
If you insist on bringing up heresy...
Motorola 680xx is pretty nice compared to x86, although the
AMD 64-bit mode is better than it was. 680xx feels almost like C,
just better ;). I think only MIPS is close in programmer
friendliness. Forth is nice too, very minimalistic and quite
powerful for the simplistic implementation. I had a Forth64
module for my C64 to dabble with, a bit hard to create more
than toy programs in Forth... Postscript is pretty close
actually, and clean. But Forth is dense, so dense that you
don't edit text files, you edit text screens... But don't diss
assembly, try to get more than 8 sprites and move sprites into
the screen border without assembly, can't be done! High level
languages, my ass, BASIC can't do that!
But hey, I am not arguing in favour of Forth and C (although I
would argue in favour of 680xx and MIPS). I am arguing in
favour of smart compilers that allow you to go low level at the
top of the call stack where it matters (inner loops) without
having to resort to a different language. D is close to that,
so it is a promising candidate.
And... I actually think D is too lax in some areas. I don't
think you should be allowed to call C/C++ without nailing down
the pre/postconditions, basically describing what happens in
terms of optimization constraints. I also want the programmer
to be able to assert facts that the compiler fails to prove, so
that they can be used for optimization. Basically the ability to
guide the optimizer so you don't have to resort to low level
coding. I also think giving access to malloc is a bad idea. :-P
And well, I am not new to GC, I have actually used Simula quite
a bit in classes/teaching newbies. Simula incidentally has
exactly the same Garbage Collector that D has AFAIK. I
remember we had a 1970s internal memo describing the garbage
collector of Simula on the curriculum of the compiler course...
So that is veeeery old news.
Actually Simula kinda has the same kind of string type
representation that D has too. And OO. And it has coroutines?
While it doesn't have templates, it does actually have name
parameters that has textual substitution semantics (in addition
to ref and value). Now I also kinda like that it has ":-" for
reference assignment and ":=" for value assignment, but I
didn't like it back then.
45 years later D merges Simula semantics with C (and some more
stuff). And that is an interesting thing, of course.
But hey, no point in pretending that other people don't know
what programming a GC high level language entails. If I want
low latency, I go to C/C++ and hopefully D. If I want high
level productivity I use whatever fits the bill: all GC
languages. But I don't think D should be the first option in
any non-speed area yet, so the GC is of limited use for now
IMO. (In clusters you might want that though, speed+convenience
but no need for low latency.)
I think D could pick up more good stuff from Python, like the
array closures that allow you to succinctly transform arrays.
They make large portions of Python's standard library pointless.
What I really like about D is that the front end code appears
to be quite readable. Take a look at clang and you will see the
difference. So, I guess anyone with C++ knowledge has the
opportunity to tune both syntax and semantics to their own
liking and share it with others. That's pretty sweet (I'd like
to try that one day).
Sorry if I hit a nerve; one never knows the experience of other
people on the Internet.

It is just that in the enterprise world I have been part of
projects that ported C and C++ based servers to JVM/.NET ones,
always with comparable performance.

I do acknowledge that in game programming it might be different;
however, even AAA studios play with GC systems nowadays, even if they
have some issues optimizing their behavior.

For example, The Witcher 2 for the XBox 360.

http://www.makinggames.de/index.php/magazin/2155_porting_the_witcher_2_on_xbox_360

--
Paulo
H. S. Teoh
2014-01-09 19:24:50 UTC
Permalink
Post by Walter Bright
On 1/9/2014 10:18 AM, "Ola Fosheim Grøstad"
Why would you do that? You would have to overload cat then.
So you agree that it won't work.
It will work for string literals or for malloc'ed strings, but not
for both using the same function unless you start to depend on the
data sections used for literals (memory range testing). Which is a
dirty tool-dependent hack.
Post by Walter Bright
Overloading doesn't work because a string literal and a string
allocated by something else have the same type.
Not if you return your own type, but have the same structure? You
return a struct, containing a variable-sized array of char, and
overload on that?
But I see your point regarding literal/malloc: const char* and char*
are a shady area, and you can basically get anything cast to const char*.
And since it is C, people expect to pass char* and const char* around.
So most likely what will happen is that if there's any way at all to get
a char* or const char* out of your opaque struct, they will do it, and
then pass it to strcat, strlen, and who knows what else. You can't
really stop this except by convention, because the language doesn't
enforce the encapsulation, and making it truly opaque (via void* with
PIMPL) will require an extra layer of indirection and make it unusable
with commonly-expected C APIs like printf.

But we all know what happens with programming by convention when the
team grows bigger -- old people who know the Right Way of doing things
leave, and new people come in ignorant of how things are Supposed To Be,
falling back to const char*, so the code quickly degenerates into a
horrible mess of mixed conventions and memory leaks / pointer bugs
everywhere. Then you start strdup'ing everything Just In Case. Which was
Walter's original point.


T
--
By understanding a machine-oriented language, the programmer will tend to use a much more efficient method; it is much closer to reality. -- D. Knuth
Benjamin Thaut
2014-01-09 10:14:12 UTC
Permalink
Post by Walter Bright
The point is, no matter how slow the GC is relative to malloc, not
allocating is faster than allocating, and a GC can greatly reduce the
amount of alloc/copy going on.
The point should be whether D is going to stay with a GC, and if so, when
we will actually get proper GC support so a state-of-the-art GC can be
implemented -- or whether we are going to replace the GC with ARC.

This is a really important topic which shouldn't wait until the language
is 20 years old. I have been using D for almost 3 years now, and the more
I learn about garbage collectors and about D, the more obvious it becomes
that D does not properly support garbage collection and that it will
require quite some effort and spec changes to do so. In all the time I
have used D nothing has changed about the garbage collector. The only
thing that happened was the RtInfo template in object.d, but it still
isn't used and only solves a small portion of the precise scanning
problem. In my opinion D was designed with language features in mind that
need a GC, but D was not designed to actually support a GC. And this needs to change.

If requested I can make a list with all language features / decisions so
far that prevent the implementation of a state of the art GC.
--
Kind Regards
Benjamin Thaut
Tobias Pankrath
2014-01-09 10:36:22 UTC
Permalink
Post by Benjamin Thaut
If requested I can make a list with all language features /
decisions so far that prevent the implementation of a state of
the art GC.
At least I am interested in your observations.
Benjamin Thaut
2014-01-09 17:50:13 UTC
Permalink
Post by Tobias Pankrath
Post by Benjamin Thaut
If requested I can make a list with all language features / decisions
so far that prevent the implementation of a state of the art GC.
At least I am interested in your observations.
Ok I will put together a list. But as I'm currently swamped with end of
semester stuff, you shouldn't expect it within the next 3 weeks. I will
post it on my blog (www.benjamin-thaut.de) and I will post it in the
"D.annouce" newsgroup.

Kind Regards
Benjamin Thaut
Paulo Pinto
2014-01-09 13:51:08 UTC
Permalink
On Thursday, 9 January 2014 at 13:44:10 UTC, Ola Fosheim Grøstad
On Thursday, 9 January 2014 at 10:14:08 UTC, Benjamin Thaut
Post by Benjamin Thaut
If requested I can make a list with all language features /
decisions so far that prevent the implementation of a state of
the art GC.
I am also interested in this, so that I can avoid those
constructs.
I am in general in agreement with you. I think regular
ownership combined with a segmented GC that only scans pointers
to a signified GC type would not be such a big deal and could
be a real bonus. With whole program analysis you could then
reject a lot of the branches you otherwise have to follow and
you would not have to stop threads that cannot touch those GC
types. Of course, you would then avoid using generic pointers.
So, you might not need an advanced GC, just partition the GC
scan better.
Scanning stacks could be really fast if you know the call order
of stack frames (and you have that opportunity with whole
program analysis): e.g.: top frame is a(), but only b() and c()
can call a() and b() and c() have same stack frame size and
cannot hold pointers to GC object => skip over a() and b/c() in
one go.
It doesn't matter much if the GC takes even 20% of your
efficiency away, as long as it doesn't lock you down for more
than 1-2 milliseconds: that's <4 million cycles for a single
core. If you need 25 cycles per pointer you can scan <80,000
pointers per core. So if the search space can be partitioned in
a way that makes that possible by not following all pointers,
then the GC would be fine. 100,000 cache lines = 3.2MB which is
not too horrible either.
I'd rather have 1000% less efficiency in the GC by having
frequent GC calls than 400% more latency less frequently.
That could possibly be achieved with a generational parallel GC.


--
Paulo
Paulo Pinto
2014-01-09 14:40:15 UTC
Permalink
On Thursday, 9 January 2014 at 14:19:41 UTC, Ola Fosheim Grøstad
Post by Paulo Pinto
That could possibly be achieved with a generational parallel
GC.
Isn't the basic assumption in a generational GC that most
free'd objects have a short life span and happened since the
last collection? Was there some assumption about the majority
of inter-object pointers being within the same generation, too?
So that you partition the objects in "train carts" and only
have few pointers going between carts? I haven't looked at the
original paper in a long time...
That was just a suggestion. There are plenty of incremental GC
algorithms to choose from.
Anyway, if that is the assumption then it is generally not true
for programs that are written for real time. Temporary objects
are then allocated in pools or on the stack. Objects that are
free'd tend to come from timers, events or because they have a
lifespan (like enemies in a computer game).
There are real time GCs controlling missile tracking systems.

Personally I find them a bit more real time than computer games.

In a game you might miss a few rendering frames; a GC-induced
delay on a missile tracking system might turn out a bit ugly.
I also dislike the idea of the GC locking cores down when it
doesn't have to, so I don't think parallel is particularly
useful. It will just put more pressure on the memory bus. I
think it is sufficient to have a simple GC that only scans
disjoint subsets (for that kind of application), so yes
partitioned by type, or better: by reachability, but not by
generation.
If the GC behaviour is predictable then the application can be
designed to not trigger bad behaviour from the get go.
Sure, the GC usage should not hinder the application's
performance.

However, unless you target systems without an OS, you'll have
the OS doing whatever it wants with the existing cores anyway.

I never saw much control besides setting affinities.

--
Paulo
Paulo Pinto
2014-01-09 15:11:52 UTC
Permalink
On Thursday, 9 January 2014 at 14:57:31 UTC, Ola Fosheim Grøstad
Post by Benjamin Thaut
Post by Paulo Pinto
In a game you might miss a few rendering frames; a GC-induced
delay on a missile tracking system might turn out a bit ugly.
You have GC in games, but you limit it to a small set of
objects (<50000?)
So you can have real time with GC with an upper-bound.
Putting everything under GC is probably not a future proof
concept, since memory capacity most likely will increase faster
than CPU speed for technical reasons.
Sure. As I mentioned in another thread, the other GC-enabled
system programming languages I know also allow for static,
global and stack allocation.

And you also have an escape hatch to do manual memory management
if you really have to.

Namely Oberon(-2), Component Pascal, Active Oberon, Modula-3,
Sing# and Cedar. Just in case you feel like looking any of them
up.

While those ended up never being adopted by the industry at
large, we can draw lessons from the experience of their users.
Positive features and related flaws.

Currently I am digging up the Mesa/Cedar reports from Xerox PARC.

I think D already has the necessary features, their performance
just needs to be improved.
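
To be concrete, a rough (untested) sketch of how those allocation
flavours already look in D today:

    import core.stdc.stdlib : free, malloc;

    immutable int[4] table = [1, 2, 3, 4];    // static data, never touches the GC heap

    void work()
    {
        double[64] scratch;                    // plain stack allocation
        scratch[] = 0.0;

        auto raw = cast(ubyte*) malloc(1024);  // manual escape hatch, invisible to the GC
        scope(exit) free(raw);
        raw[0 .. 1024] = 0;

        auto list = new int[](128);            // and the GC-managed part, collected for you
        list[0] = table[0] + cast(int) scratch[0] + raw[0];
    }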

--
Paulo
Benjamin Thaut
2014-01-09 14:51:53 UTC
Permalink
On 09.01.2014 15:28, "Ola Fosheim Grøstad" wrote:
And, if it isn't in D already I would very much like to have a weak
pointer type that will be set to null if the object is only pointed to
by weak pointers.
It is a PITA to have objects die and get them out of a bunch of
event-queues etc.
Didn't Phobos get such a weak pointer type lately? I at least saw an
implementation on the newsgroup very recently.

It used core.memory.setAttr to store information in objects. Then you
can overwrite the collectHandler in core.runtime to null the weak
references upon destruction.
--
Kind Regards
Benjamin Thaut

Ola Fosheim Grøstad
2014-01-09 14:57:30 UTC
Permalink
In a game you might miss a few rendering frames; a GC-induced
delay on a missile tracking system might turn out a bit ugly.
You have GC in games, but you limit it to a small set of objects
(<50000?)
So you can have real time with GC with an upper-bound.

Putting everything under GC is probably not a future proof
concept, since memory capacity most likely will increase faster
than CPU speed for technical reasons.
However, unless you target systems without an OS, you'll have
the OS doing whatever it wants with the existing cores anyway.
Yes, but you don't blame the application if the scheduler isn't
real time friendly. Linux has been kind of bad, because
distributions have been focused on servers. But you find real
time friendly schedulers too.
H. S. Teoh
2014-01-09 16:02:02 UTC
Permalink
Post by Walter Bright
and it works without copying in D, it just returns s1. In C, I gotta copy, ALWAYS.
Only if you write libraries, in an application you can set your own
policies (invariants).
Yes, programming by convention, which falls flat as soon as you have a
large team on the project, and people don't know your conventions
(you'll be surprised how many "seasoned" programmers will just walk all
over your code writing what they're used to writing, with no thought to
read the code first and figure out how their code might fit in with the
rest). I see lots of this at my job, and it inevitably leads to
problems, because in C, people just *expect* the usual copying
conventions. Sure, if you're a one-man project, then you can remove some
of this copying, but rest assured that in a team project things will go
haywire, and inevitably you'll end up dictating that everyone must copy
everything because that's the only way to guarantee module X, which is
written by team B, doesn't do something screwy with our data.
Post by Walter Bright
(C's strings being 0 terminated also forces much extra copying,
but that's another topic.)
Not if you have your own allocator and split chopped strings (you
can just overwrite the boundary character).
You can't do this if the caller still wishes to retain the original
string.
Post by Walter Bright
The point is, no matter how slow the GC is relative to malloc, not
allocating is faster than allocating, and a GC can greatly reduce
the amount of alloc/copy going on.
But since malloc/free is tedious, C programmers tend to avoid it by
embedding objects in large structs and putting a variable-sized object
at the end of it... Or have their own pool (possibly on the stack at
the location where it should be released).
[...]

One thing I miss in D is a nice way to allocate structs with a
variable-length "static" array at the end. GCC supports this, probably
as an extension (I don't remember if the C standard specifies this). I
know I can just manually allocate this via core.gc and casts, but a
built-in solution would be really nice.
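
Something along these lines is what I mean by doing it by hand today
(rough, untested sketch with made-up names):

    import core.memory : GC;

    struct Packet
    {
        uint length;        // ordinary header fields...
        // ...while the payload bytes live directly after the struct,
        // inside the same allocation.

        @property ubyte[] payload()
        {
            return (cast(ubyte*)(&this + 1))[0 .. length];
        }
    }

    Packet* makePacket(size_t n)
    {
        auto p = cast(Packet*) GC.malloc(Packet.sizeof + n);
        p.length = cast(uint) n;
        return p;
    }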


T
--
Sometimes the best solution to morale problems is just to fire all of the unhappy people. -- despair.com
bearophile
2014-01-09 16:12:45 UTC
Permalink
Post by H. S. Teoh
One thing I miss in D is a nice way to allocate structs with a
variable-length "static" array at the end. GCC supports this,
probably as an extension (I don't remember if the C standard
specifies this). I know I can just manually allocate this via
core.gc and casts, but a built-in solution would be really nice.
Since dmd 2.065 D supports this very well (it was supported in
the past too, but less well). See:
http://rosettacode.org/wiki/Sokoban#Faster_Version

Bye,
bearophile
NoUseForAName
2014-01-08 23:08:41 UTC
Permalink
Post by H. S. Teoh
Post by Atila Neves
http://atilanevesoncode.wordpress.com/2014/01/08/adding-java-and-c-to-the-mqtt-benchmarks-or-how-i-learned-to-stop-worrying-and-love-the-garbage-collector/
Manual memory management is a LOT of effort
Not in my experience. It only gets ugly if you attempt to write
Ruby/Java in C/C++. In C/C++ you do not wildly create short-lived
objects all over the place. In embedded C there is often no
object allocation at all after initialization. I have written C
and C++ code for 15 years and the only real issue was memory
safety but you do not need a GC to solve that problem.
Post by H. S. Teoh
unless you're writing an AAA 3D game engine, you don't *need*
that last
5% performance improvement that manual memory management
*might* give
you.
The performance issues of GC are not measured in percentages but
in pause times. Those become problematic when - for example -
your software must achieve a frame rate of at least 60 frames per
second - every second. In future this will get worse because it
seems the trend goes towards 120 Hz screens which require a frame
rate of at least 120 frames per second for the best experience.
Try squeezing D's stop-the-world GC pause times in there.

The D solution is to avoid the GC and fall back to C-style code.
That is why Rust creates so much more excitement among C/C++
programmers. You get high-level code, memory safety AND no pause
times.
NoUseForAName
2014-01-08 23:43:42 UTC
Permalink
On Wednesday, 8 January 2014 at 23:27:39 UTC, Ola Fosheim Grøstad
let mut x = 4.
Whyyy would anyone want to create such a syntax? I really want
to like Rust, but I... just...
Looks pretty boring/conventional to me. If you know many
programming languages you immediately recognize "let" as a common
keyword for assignment. That keyword is older than me and I am
old (by Silicon Valley standards).

That leaves only the funny sounding "mut" as slightly unusual. It
is the result of making immutable the default which I think is a
good decision.

It is horribly abbreviated but the vast majority of programmers
who know what a cache miss is seem to prefer such abbreviations
(I am not part of that majority, though). I mean C gave us
classics like "atoi".. still reminds me of "ahoi" every time I
read it. And I will never get over C++'s "cout" and "cin". See?
Rust makes C/C++ damaged people feel right at home even there ;P

Ola Fosheim Grøstad
2014-01-08 23:45:45 UTC
Permalink
Subject: Graphics Library for D
The fork on SourceForge, although considered maintained, it
contains only a few small changes. Right now the revision
number of that repo is only about 90, and there isn't much
happening in the repo over the years. I think if we pick up the
Sadly, the author apparently died in November:

http://www.microsofttranslator.com/bv.aspx?from=ru&to=en&a=http://rsdn.ru/forum/life/5377743.flat
H. S. Teoh
2014-01-09 00:49:55 UTC
Permalink
[...]
Post by NoUseForAName
(I am not part of that majority, though). I mean C gave us
classics like "atoi".. still reminds me of "ahoi" every time I
read it. And I will never get over C++'s "cout" and "cin". See?
The absolute worst offender from the C days was creat(). I mean,
seriously?? I'm actually a fan of abbreviated names myself, but that one
simply takes it to a whole 'nother level of wrong.
I don't mind cout, I hardly use cin, I try to avoid cerr, and I've
never used clog? I mind how you configure iostreams though. It looks
worse than printf, not sure how they managed that.
[...]

I hate iostream with a passion. The syntax is only the tip of the
proverbial iceberg. Manipulators that change the global state of the
output stream, pathologically verbose ways of controlling output format
(cout << setprecision(5) << num; -- really?!) that *also* modify
global state, crazy choice of output operator with counter-intuitive
operator precedence (cout << a&b doesn't do what you think it does), ...
I have trouble finding what's there to like about iostream.

Even when I was still writing C++ a few years ago, I avoided iostream
like the plague. For all of its flaws, C's stdio is still far better
than iostream in terms of everyday usability. At least for me. YMMV.


T
--
Marketing: the art of convincing people to pay for what they didn't need before which you can't deliver after.
H. S. Teoh
2014-01-09 01:24:54 UTC
Permalink
Post by H. S. Teoh
The absolute worst offender from the C days was creat().
That's unfair, that's unix, not C!
http://linux.die.net/man/3/explain_creat_or_die
That's why I said "from the C days", not "in C". :) Remember that C was
created... um, creat-ed... in order to write Unix.


T
--
Gone Chopin. Bach in a minuet.

Ola Fosheim Grøstad
2014-01-09 01:37:07 UTC
Permalink
Subject: Graphics Library for D

On Thursday, 9 January 2014 at 00:20:00 UTC, Ola Fosheim Grøstad
Basically you partition edges into sub-pixel polygons and sort
them. Then you calculate visibility and coverage before shading.
(note: this is not how REYES works for 3D, but I think it could be
adapted with good results for non-realtime 2D this way.)
Brad Anderson
2014-01-09 05:41:35 UTC
Permalink
On Thursday, 9 January 2014 at 01:06:03 UTC, Ola Fosheim Grøstad
Post by H. S. Teoh
The absolute worst offender from the C days was creat().
That's unfair, that's unix, not C!
http://linux.die.net/man/3/explain_creat_or_die
But that just means the same people are responsible.
Paulo Pinto
2014-01-09 06:46:57 UTC
Permalink
On Wed, Jan 08, 2014 at 11:59:58PM +0000,
On Wednesday, 8 January 2014 at 23:43:43 UTC, NoUseForAName
[...]
Post by NoUseForAName
(I am not part of that majority, though). I mean C gave us
classics like "atoi".. still reminds me of "ahoi" every time I
read it. And I will never get over C++'s "cout" and "cin".
See?
The absolute worst offender from the C days was creat(). I mean,
seriously?? I'm actually a fan of abbreviated names myself, but
that one
simply takes it to a whole 'nother level of wrong.
I don't mind cout, I hardly use cin, I try to avoid cerr, and
I've
never used clog? I mind how you configure iostreams though. It
looks
worse than printf, not sure how they managed that.
[...]
I hate iostream with a passion.
I am on the other side of the fence, enjoying iostream since
1994. :)

--
Paulo
Paulo Pinto
2014-01-09 06:49:13 UTC
Permalink
On Wednesday, 8 January 2014 at 23:59:59 UTC, Ola Fosheim Grøstad
On Wednesday, 8 January 2014 at 23:43:43 UTC, NoUseForAName
Post by NoUseForAName
Looks pretty boring/conventional to me. If you know many
programming languages you immediately recognize "let" as a
common keyword for assignment.
Yes, but I cannot think of a single one of them that I would
like to use! ;-)
Post by NoUseForAName
That leaves only the funny sounding "mut" as slightly unusual.
It is the result of making immutable the default which I think
is a good decision.
Agree on the last point, immutable should be the default.
Although I think they should have skipped both "let" and "mut"
and used a different symbol for initial-assignment instead.
Post by NoUseForAName
(I am not part of that majority, though). I mean C gave us
classics like "atoi".. still reminds me of "ahoi" every time I
read it. And I will never get over C++'s "cout" and "cin". See?
I don't mind cout, I hardly use cin, I try to avoid cerr, and
I've never used clog? I mind how you configure iostreams
though. It looks worse than printf, not sure how they managed
that.
Post by NoUseForAName
Rust makes C/C++ damaged people feel right at home even there
;P
Well, I associate "let" with the functional-toy-languages we
created/used at the university in the 90s so I kind of have
problem taking Rust seriously. And the name? RUST? Decaying
metal. Why? It gives me the eerie feeling that the designers
are either brilliant, mad or both, or that it is a practical
joke. I'm sure the compiler randomly tells you Aprils Fools! Or
something.
You mean the toy languages that are slowly replacing C++ in the
finance industry?
Sean Kelly
2014-01-09 19:01:59 UTC
Permalink
Post by H. S. Teoh
Post by Atila Neves
http://atilanevesoncode.wordpress.com/2014/01/08/adding-java-and-c-to-the-mqtt-benchmarks-or-how-i-learned-to-stop-worrying-and-love-the-garbage-collector/
I have to say, this is also my experience with C++ after I
learnt D.
Writing C++ is just so painful, so time-consuming, and so not
rewarding
for the amount of effort you put into it, that I just can't
bring myself
to write C++ anymore when I have the choice. And manual memory
management is a big part of that time sink. Which is why I
believe that
a lot of the GC-phobia among the C/C++ folk is misplaced. I can
sympathise, though, because coming from a C/C++ background
myself, I was
highly skeptical of GC'd languages, and didn't find it to be a
particularly appealing aspect of D when I first started
learning it.
But as I learned D, I eventually got used to having the GC
around, and
discovered that not only it reduced the number of memory bugs
dramatically, it also increased my productivity dramatically: I
never realized just how much time and effort it took to write
code with manual memory management: you constantly have to
think about how exactly you're going to be storing your
objects, who it's going to get passed to, how to decide who's
responsible for freeing it, what's the best strategy for
deciding who allocates and who frees. These considerations
permeate every aspect of your code, because you need to know
whether to
pass/return an object* to someone, and whether this pointer
implies
transfer of ownership or not, since that determines who's
responsible to free it, etc.. Even with C++'s smart pointers,
you still have to decide which one to use, and what pitfalls
are associated with them (beware of cycles with refcounted
pointers, passing auto_ptr to somebody might invalidate it
after they return, etc.). It's like income tax: on just about
every line of code you write, you have to pay the "memory
management tax" of extra mental overhead and time spent fixing
pointer bugs in order to not get the IRS (Invalid Reference
Segfault :P)
knocking on your shell prompt.
This is what initially drew me to D from C++. Having a GC is a
huge productivity gain.
Post by H. S. Teoh
Manual memory management is a LOT of effort, and to be quite
honest, unless you're writing an AAA 3D game engine, you don't
*need* that last 5% performance improvement that manual memory
management *might* give you. That is, if you get it right.
Which most C/C++ coders don't.
The other common case is server apps, since unpredictable delays
can be quite undesirable as well. Java seems to mostly get
around this by having very mature and capable GCs despite having
a standard library that wants you to churn through memory like
pies at an eating contest. The best you can do with D so far is
mostly to just not allocate whenever possible, by slicing strings
and such, since scanning can still be costly. I think there's
still some work to do here, despite loving the GC as a general
feature.
H. S. Teoh
2014-01-09 19:40:13 UTC
Permalink
[...]
Post by Sean Kelly
Post by H. S. Teoh
Manual memory management is a LOT of effort, and to be quite
honest, unless you're writing an AAA 3D game engine, you don't
*need* that last 5% performance improvement that manual memory
management *might* give you. That is, if you get it right. Which
most C/C++ coders don't.
The other common case is server apps, since unpredictable delays
can be quite undesirable as well. Java seems to mostly get
around this by having very mature and capable GCs despite having
a standard library that wants you to churn through memory like
pies at an eating contest. The best you can do with D so far is
mostly to just not allocate whenever possible, by slicing strings
and such, since scanning can still be costly. I think there's
still some work to do here, despite loving the GC as a general
feature.
I think we all agree that D's GC in its current state needs a lot of
improvement. While I have come to accept GCs as a good thing, that
doesn't mean that D's current GC is *that* good. Yet. I wish I had the
know-how (and the time!) to improve D's GC, because if D can get a GC
that's on par with Java's, then D can totally beat Java flat, since the
existence of value types greatly reduces the memory pressure on the GC,
so the GC will have much less work to do compared to an equivalent Java
program.

OTOH, even with D's suboptimal GC, I'm already seeing great productivity
gains at only a low cost, so that's a big thumbs up for GC's. And the
nice thing about being able to call malloc from D (which you can't in
Java) means you can still do manual memory management in critical code
sections when you need to squeeze out some extra performance.
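
For example, a typical pattern for a hot path might look roughly like
this (untested sketch):

    import core.stdc.stdlib : free, malloc;

    double sumOfSquares(size_t n)
    {
        // Scratch buffer straight from the C heap: no GC allocation here,
        // so nothing in this loop can trigger a collection.  (If the buffer
        // held pointers into the GC heap we'd also have to register it with
        // core.memory.GC.addRange.)
        auto buf = (cast(double*) malloc(n * double.sizeof))[0 .. n];
        scope(exit) free(buf.ptr);

        double total = 0;
        foreach (i, ref x; buf)
        {
            x = cast(double) i;
            total += x * x;
        }
        return total;
    }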


T
--
Turning your clock 15 minutes ahead won't cure lateness---you're just making time go faster!
Paulo Pinto
2014-01-09 19:52:53 UTC
Permalink
Post by H. S. Teoh
[...]
Post by Sean Kelly
Post by H. S. Teoh
Manual memory management is a LOT of effort, and to be quite
honest, unless you're writing an AAA 3D game engine, you don't
*need* that last 5% performance improvement that manual memory
management *might* give you. That is, if you get it right. Which
most C/C++ coders don't.
The other common case is server apps, since unpredictable delays
can be quite undesirable as well. Java seems to mostly get
around this by having very mature and capable GCs despite having
a standard library that wants you to churn through memory like
pies at an eating contest. The best you can do with D so far is
mostly to just not allocate whenever possible, by slicing strings
and such, since scanning can still be costly. I think there's
still some work to do here, despite loving the GC as a general
feature.
I think we all agree that D's GC in its current state needs a lot of
improvement. While I have come to accept GCs as a good thing, that
doesn't mean that D's current GC is *that* good. Yet. I wish I had the
know-how (and the time!) to improve D's GC, because if D can get a GC
that's on par with Java's, then D can totally beat Java flat, since the
existence of value types greatly reduces the memory pressure on the GC,
so the GC will have much less work to do compared to an equivalent Java
program.
OTOH, even with D's suboptimal GC, I'm already seeing great productivity
gains at only a low cost, so that's a big thumbs up for GC's. And the
nice thing about being able to call malloc from D (which you can't in
Java) means you can still do manual memory management in critical code
sections when you need to squeeze out some extra performance.
T
Well, there are a few options to call malloc from Java:

- Do your own JNI wrapper
- Use Java Native Access
- Use Java Native Runtime
- Use NIO Buffers
- Use sun.misc.Unsafe.allocateMemory (sun.misc.Unsafe is planned to
become a public API)

--
Paulo
qznc
2014-01-09 21:35:44 UTC
Permalink
Post by H. S. Teoh
because if D can get a GC
that's on par with Java's, then D can totally beat Java flat,
since the
existence of value types greatly reduces the memory pressure on
the GC,
so the GC will have much less work to do compared to an
equivalent Java
program.
Java will probably gain (something like) value types at some
point. Google for "packed objects", it provides similar gains as
value types.

Hopefully, D gets a better GC first.
Brian Rogoff
2014-01-09 22:51:22 UTC
Permalink
Post by qznc
Post by H. S. Teoh
because if D can get a GC
that's on par with Java's, then D can totally beat Java flat,
since the
existence of value types greatly reduces the memory pressure
on the GC,
so the GC will have much less work to do compared to an
equivalent Java
program.
Java will probably gain (something like) value types at some
point. Google for "packed objects", it provides similar gains
as value types.
Hopefully, D gets a better GC first.
What's the status of all that? There were interesting talks at
DConf 2013 about precise and concurrent GCs, and it seemed that
work was going on to fold all that into the compilers, and that
Walter/Andrei were ready to make changes to the spec and runtime
if needed to support precise GC. All very encouraging.

Will DMD have a precise GC by the next DConf?

-- Brian
H. S. Teoh
2014-01-09 23:29:09 UTC
Permalink
Post by qznc
because if D can get a GC that's on par with Java's, then D can
totally beat Java flat, since the existence of value types greatly
reduces the memory pressure on the GC, so the GC will have much less
work to do compared to an equivalent Java program.
Java will probably gain (something like) value types at some
point. Google for "packed objects", it provides similar gains as
value types.
Hopefully, D gets a better GC first.
What's the status of all that? There were interesting talks at DConf
2013 about precise and concurrent GCs, and it seemed that work was
going on to fold all that into the compilers, and that Walter/Andrei
were ready to make changes to the spec and runtime if needed to
support precise GC. All very encouraging.
Will DMD have a precise GC by the next DConf?
[...]

Has *anything* been done on the GC at all since the previous DConf? Not
trying to be provocative, just genuinely curious if anything has been
happening on that front, since I don't remember seeing any commits in
that area all year.


T
--
"I'm running Windows '98." "Yes." "My computer isn't working now." "Yes, you already said that." -- User-Friendly
Walter Bright
2014-01-10 02:08:50 UTC
Permalink
Post by H. S. Teoh
Has *anything* been done on the GC at all since the previous DConf? Not
trying to be provocative, just genuinely curious if anything has been
happening on that front, since I don't remember seeing any commits in
that area all year.
Not much.