Note: this post was written in early March and was supposed to be published then. Further observations were added after the closing words to make up for this unfortunate oversight.
Managing resources is a solved problem
In my tsess introduction I rambled on about Hare and how it fares against C, Rust and Go. The answer was pretty well, and tsess-0.2 is a good occasion to revisit this.
As the project probably already covers 120% of my needs, there will be little to add besides small features I have in mind that only add convenience to a utility I already find more than good enough on a daily basis. These features will be a good excuse to explore a little more of Hare. That excuse has been the main driver for the initial attempt to write tsess in Rust and the current success in Hare.
As stated previously, Valgrind found leaks in tsess, so safety in Hare is already a failure. After all, Rust solved this with its ownership model enforced by its borrow checker. But the same borrow checker reached its limit when I tried to fit a single allocation model for the configuration file with parsing in place. This was a different kind of failure, either on Rust for not being able to get a &&str from a Box<str>, or on me for not figuring out how.
Needless to say, I wouldn’t have this problem in C (although I would have a different problem called null-terminated strings) and I didn’t have this problem with Hare. But nevertheless, I leaked resources.
Revisiting tsess leaks
Since I’m too lazy to go back in Git history before the commits that plugged leaks and restore the tool chain I was using then, I will simply revert the bug fixes to revisit the leaks. Fortunately, so far, I made the leak fixes easy to find from commit messages alone.
$ git revert --no-commit --no-edit $(git log --grep Plug --format=%H)
Auto-merging bin/tsess/config/env.ha
Interesting fact: so far I only fixed leaks in env.ha, and what is not visible is that two leaks were plugged.
Now, let’s ask Valgrind about problems, but let’s do it as I used to, from a statically linked tsess.
After running the following command:
valgrind --tool=memcheck --leak-check=full --track-fds=all ./tsess list
I get the following report from Valgrind, after cleaning it up:
FILE DESCRIPTORS: 4 open (3 std) at exit.
Open file descriptor 3: /home/dridi/.config/tsess/example.tsess
at 0x8006BB1: ??? (in /home/dridi/src/tsess/tsess)
by 0x8003721: ??? (in /home/dridi/src/tsess/tsess)
by 0x80166FD: ??? (in /home/dridi/src/tsess/tsess)
by 0x801657C: ??? (in /home/dridi/src/tsess/tsess)
by 0x8011C8E: ??? (in /home/dridi/src/tsess/tsess)
by 0x8012ED9: ??? (in /home/dridi/src/tsess/tsess)
by 0x805421A: ??? (in /home/dridi/src/tsess/tsess)
by 0x8053928: ??? (in /home/dridi/src/tsess/tsess)
by 0x80531E2: ??? (in /home/dridi/src/tsess/tsess)
by 0x80507EA: ??? (in /home/dridi/src/tsess/tsess)
by 0x806241E: ??? (in /home/dridi/src/tsess/tsess)
Open file descriptor 2: /dev/pts/6
<inherited from parent>
Open file descriptor 1: /dev/pts/6
<inherited from parent>
Open file descriptor 0: /dev/pts/6
<inherited from parent>
HEAP SUMMARY:
in use at exit: 0 bytes in 0 blocks
total heap usage: 0 allocs, 0 frees, 0 bytes allocated
All heap blocks were freed -- no leaks are possible
For lists of detected and suppressed errors, rerun with: -s
ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Despite the lack of a useful stack trace, it was easy to find the source of the leak since there is only one place in the code that opens files. The bug fix is a one-liner:
$ git show 8d525d563f13c3ce995471e23fa1078bb90fe8e9
commit 8d525d563f13c3ce995471e23fa1078bb90fe8e9
config: Plug file leak
Spotted by Valgrind.
diff --git a/bin/tsess/config/env.ha b/bin/tsess/config/env.ha
index cf21695..7449318 100644
--- a/bin/tsess/config/env.ha
+++ b/bin/tsess/config/env.ha
@@ -74,6 +74,7 @@ fn file_size(h: io::handle) (size | io::error) = {
fn file_load(path: str) ([]u8 | error) = {
let file = os::open(path)?;
+ defer io::close(file)!;
if (file_size(file)? == 0) {
return errors::unsupported: io::error;
};
With the only file leak plugged, it’s time to look at memory leaks. I know for a fact that there are more than zero allocations, and I also know for a fact that Valgrind could find them if tsess was dynamically linked.
Memory leaks
Since version 0.2, tsess is dynamically linked by default. It is however still possible to statically link it at build time:
$ make tsess_LDFLAGS=
CCLD gen_cmd
GEN bin/tsess/cli/gen/cmd.ha
CCLD gen_env
GEN bin/tsess/config/gen/env_vars.ha
CCLD gen_prop
GEN bin/tsess/config/gen/prop_defs.ha
CCLD tsess
GEN man/tsess.1
GEN man/tsess.5
$ file * | grep ELF
gen_cmd: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, with debug_info, not stripped
gen_env: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, with debug_info, not stripped
gen_prop: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, with debug_info, not stripped
tsess: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, with debug_info, not stripped
Side note, the invocation eventually changed to make tsess_LIBS=.
My rationale for dynamically linking by default is that libc may provide certain services based on the system configuration. For example, a system user or group lookup could be done from /etc/passwd or /etc/group, or from an NSS module.
This can only work if the Hare runtime is implemented with fall-backs to libc functions. There is at least one service from libc that will be used, and it’s the memory allocator. This probably means that we can also use a different malloc implementation, for example linking to jemalloc for its profiling capabilities among other things, but I digress.
So, once I dynamically link tsess, let’s look at memory leaks. After running the same Valgrind command I get the following suggestion:
Reachable blocks (those to which a pointer was found) are not shown.
To see them, rerun with: --leak-check=full --show-leak-kinds=all
Since I was able to confirm that the file leak was plugged, I ran this command instead:
valgrind --tool=memcheck --leak-check=full --show-leak-kinds=all ./tsess list
And I get the following report, once again cleaned up to keep the essentials:
HEAP SUMMARY:
in use at exit: 16,472 bytes in 5 blocks
total heap usage: 102 allocs, 97 frees, 62,379 bytes allocated
12 bytes in 1 blocks are indirectly lost in loss record 1 of 5
at 0x484280F: malloc (vg_replace_malloc.c:442)
by 0x401F22: ??? (in /home/dridi/src/tsess/tsess)
by 0x41844D: ??? (in /home/dridi/src/tsess/tsess)
by 0x42B9C6: ??? (in /home/dridi/src/tsess/tsess)
by 0x42B3DA: ??? (in /home/dridi/src/tsess/tsess)
by 0x430182: ??? (in /home/dridi/src/tsess/tsess)
by 0x47F907: ??? (in /home/dridi/src/tsess/tsess)
by 0x47D2D5: ??? (in /home/dridi/src/tsess/tsess)
by 0x48F631: ??? (in /home/dridi/src/tsess/tsess)
by 0x408738: ??? (in /home/dridi/src/tsess/tsess)
by 0x4897149: (below main) (libc_start_call_main.h:58)
19 bytes in 1 blocks are still reachable in loss record 2 of 5
at 0x484280F: malloc (vg_replace_malloc.c:442)
by 0x401F22: ??? (in /home/dridi/src/tsess/tsess)
by 0x418A85: ??? (in /home/dridi/src/tsess/tsess)
by 0x4824E6: ??? (in /home/dridi/src/tsess/tsess)
by 0x482667: ??? (in /home/dridi/src/tsess/tsess)
by 0x482794: ??? (in /home/dridi/src/tsess/tsess)
by 0x408D51: ??? (in /home/dridi/src/tsess/tsess)
by 0x408733: ??? (in /home/dridi/src/tsess/tsess)
by 0x4897149: (below main) (libc_start_call_main.h:58)
25 bytes in 1 blocks are still reachable in loss record 3 of 5
at 0x484280F: malloc (vg_replace_malloc.c:442)
by 0x401F22: ??? (in /home/dridi/src/tsess/tsess)
by 0x418A85: ??? (in /home/dridi/src/tsess/tsess)
by 0x482A67: ??? (in /home/dridi/src/tsess/tsess)
by 0x408D51: ??? (in /home/dridi/src/tsess/tsess)
by 0x408733: ??? (in /home/dridi/src/tsess/tsess)
by 0x4897149: (below main) (libc_start_call_main.h:58)
44 (32 direct, 12 indirect) bytes in 1 blocks are definitely lost in loss record 4 of 5
at 0x484A074: realloc (vg_replace_malloc.c:1690)
by 0x401F0B: ??? (in /home/dridi/src/tsess/tsess)
by 0x402075: ??? (in /home/dridi/src/tsess/tsess)
by 0x42B3FC: ??? (in /home/dridi/src/tsess/tsess)
by 0x430182: ??? (in /home/dridi/src/tsess/tsess)
by 0x47F907: ??? (in /home/dridi/src/tsess/tsess)
by 0x47D2D5: ??? (in /home/dridi/src/tsess/tsess)
by 0x48F631: ??? (in /home/dridi/src/tsess/tsess)
by 0x408738: ??? (in /home/dridi/src/tsess/tsess)
by 0x4897149: (below main) (libc_start_call_main.h:58)
16,384 bytes in 1 blocks are definitely lost in loss record 5 of 5
at 0x484280F: malloc (vg_replace_malloc.c:442)
by 0x401F22: ??? (in /home/dridi/src/tsess/tsess)
by 0x4602A1: ??? (in /home/dridi/src/tsess/tsess)
by 0x408D51: ??? (in /home/dridi/src/tsess/tsess)
by 0x408733: ??? (in /home/dridi/src/tsess/tsess)
by 0x4897149: (below main) (libc_start_call_main.h:58)
LEAK SUMMARY:
definitely lost: 16,416 bytes in 2 blocks
indirectly lost: 12 bytes in 1 blocks
possibly lost: 0 bytes in 0 blocks
still reachable: 44 bytes in 2 blocks
suppressed: 0 bytes in 0 blocks
For lists of detected and suppressed errors, rerun with: -s
ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)
The main reason why I didn’t dive straight into memory leaks is the uselessness of this report. From the stack trace I can easily infer which lines are inside tsess’ main() function but after that, I wasn’t going to follow all the paths that could lead to an allocation after “that many” function calls for each leak. Especially since that would also mean following paths into the standard library. I’d rather have a computer do this repetitive work for me.
I’m also worried about this 16kB allocation. What did I do so wrong to leak 16kB of memory when this Valgrind experiment loads one configuration file that has a size below 1kB? Did I forget to free a buffer for buffered I/O? That’s highly unlikely since tsess is very conservative with allocations.
The Rust experiment failed because of my insistence on having one allocation per configuration file, and working with in-place substrings. There should be no such buffer.
I was also wondering where the two reachable outstanding allocations came from. I was well aware of one source of allocations that could be held until completion, and it happens to be again in env.ha: the configuration directory is the expansion of ${XDG_CONFIG_HOME}/tsess and this is done during startup by an @init function:
let tsess_dir = "";
@init fn env_init() void = {
env_args.dir = arg_resolve(env_vars.dir, env_defs.dir);
env_args.tmux = arg_resolve(env_vars.tmux, env_defs.tmux);
env_args.tmpdir = arg_resolve(env_vars.tmpdir, env_defs.tmpdir);
tsess_dir = strings::concat(env_args.dir: str, "/tsess");
};
It could even fall back to an expansion of ${HOME}/.config with the result expanded to ${HOME}/.config/tsess, leading to two reachable outstanding allocations instead of one. But based on the environment in which Valgrind was running, I had an explanation for only one of the reachable blocks remaining in the report.
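Reachable blocks like these are harmless for a short-lived tool, but they can be released at exit with a @fini function, the counterpart of @init. The following is only a minimal sketch, assuming Hare 0.24.0 (where strings::concat returns a plain str, as in the snippet above). The names dir, dir_init and dir_fini and the /tmp prefix are made up for illustration and are not taken from tsess:

```hare
use fmt;
use strings;

// Hypothetical global, standing in for tsess_dir.
let dir: str = "";

@init fn dir_init() void = {
	// Heap allocation that stays reachable until the program exits.
	dir = strings::concat("/tmp", "/tsess");
};

@fini fn dir_fini() void = {
	// Runs after main() returns and releases the string allocated at init.
	free(dir);
};

export fn main() void = {
	fmt::println(dir)!;
};
```

With such a pairing, the block should no longer show up as still reachable in a Valgrind report, at the cost of a little more code to maintain.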
My initial initial verdict on Hare safety was that it really sucked to have bugs in your code, and I was looking forward to proper debug symbols. Except that at the time I first looked at memory leaks I had proper DWARF support in my Hare tool chain, and I found how to take advantage of it by accident.
Preparing the second release
As I was heading towards a tsess-0.2 release, planets aligned and I was able to upgrade to Hare 0.24.0 with as little effort as I was planning to invest. The qbe package was already a snapshot of “almost qbe 1.2” and the harec package got upgraded to 0.24.0 so with a minimal change in my local packaging I was able to rebuild an up-to-date hare package.
For the tsess-0.2 release I toyed with the -R option for hare build and eventually left a build tree configured in release mode. I finally got motivated to revisit the Valgrind report, and this time I got this:
HEAP SUMMARY:
in use at exit: 88 bytes in 4 blocks
total heap usage: 101 allocs, 97 frees, 45,995 bytes allocated
12 bytes in 1 blocks are indirectly lost in loss record 1 of 4
at 0x484280F: malloc (vg_replace_malloc.c:442)
by 0x401F22: rt.malloc (malloc+libc.ha:7)
by 0x4146BD: strings.dup (dup.ha:15)
by 0x42DC60: fs.dirent_dup (types.ha:168)
by 0x42D674: fs.readdir (util.ha:93)
by 0x430E8C: os.readdir (os.ha:32)
by 0x4633E3: bin.tsess.config.env_load (env.ha:119)
by 0x460DB1: bin.tsess.config.list (list.ha:16)
by 0x47A413: .main (tsess.ha:42)
by 0x408738: main (start+libc.ha:20)
19 bytes in 1 blocks are still reachable in loss record 2 of 4
at 0x484280F: malloc (vg_replace_malloc.c:442)
by 0x401F22: rt.malloc (malloc+libc.ha:7)
by 0x414CF5: strings.concat (concat.ha:10)
by 0x465FC2: bin.tsess.config.arg_expand (env.ha:39)
by 0x466143: bin.tsess.config.arg_resolve (env.ha:29)
by 0x466270: initfunc.0 (env.ha:18)
by 0x408D51: rt.init (initfini.ha:9)
by 0x408733: main (start+libc.ha:19)
25 bytes in 1 blocks are still reachable in loss record 3 of 4
at 0x484280F: malloc (vg_replace_malloc.c:442)
by 0x401F22: rt.malloc (malloc+libc.ha:7)
by 0x414CF5: strings.concat (concat.ha:10)
by 0x466543: initfunc.0 (env.ha:21)
by 0x408D51: rt.init (initfini.ha:9)
by 0x408733: main (start+libc.ha:19)
44 (32 direct, 12 indirect) bytes in 1 blocks are definitely lost in loss record 4 of 4
at 0x484A074: realloc (vg_replace_malloc.c:1690)
by 0x401F0B: rt.realloc (malloc+libc.ha:21)
by 0x402075: rt.ensure (ensure.ha:24)
by 0x42D696: fs.readdir (util.ha:93)
by 0x430E8C: os.readdir (os.ha:32)
by 0x4633E3: bin.tsess.config.env_load (env.ha:119)
by 0x460DB1: bin.tsess.config.list (list.ha:16)
by 0x47A413: .main (tsess.ha:42)
by 0x408738: main (start+libc.ha:20)
LEAK SUMMARY:
definitely lost: 32 bytes in 1 blocks
indirectly lost: 12 bytes in 1 blocks
possibly lost: 0 bytes in 0 blocks
still reachable: 44 bytes in 2 blocks
suppressed: 0 bytes in 0 blocks
One thing stands out immediately: the report becomes actionable with proper stack traces. It becomes apparent that my initial assessment of the reachable bits was wrong: one was the expansion of ${HOME}/.config and the other one was the one I identified. That left only one direct memory leak, with no trace of the 16kB leak. Since the -R option omits the debug package, I can only come to the conclusion that the remaining leaks were in the standard library.
Plugging the final leak
The bug fix for this second leak looks a lot like the fix for the first one:
$ git show --format=short 74c8d4cf5c94be9c41ef2615e1421b7f1d27809c
commit 74c8d4cf5c94be9c41ef2615e1421b7f1d27809c
config: Plug leak spotted by Valgrind
diff --git a/bin/tsess/config/env.ha b/bin/tsess/config/env.ha
index de1b1d2..aa05b8a 100644
--- a/bin/tsess/config/env.ha
+++ b/bin/tsess/config/env.ha
@@ -122,6 +122,7 @@ export fn env_load(sess: *[]sess) (void | src_error) = {
case let d: []fs::dirent =>
yield d;
};
+ defer fs::dirents_free(list);
for (let i = 0z; i < len(list); i += 1) {
if (!fs::isfile(list[i].ftype))
continue;
The problem with this one-liner is that I distinctly remember adding this line when I first started listing files from the configuration directory. But prior to this commit there are no traces of it. My only explanation is a Vim or Git accident, most likely an undo operation (simply pressing ‘u’) that flew under my radar.
This is a shining example of something that wouldn’t have happened in Rust. I would likely never have freed that resource and simply relied on the fact that list would have been dropped at the end of its scope.
A quick check of memory leaks, without worrying about reachable memory:
valgrind --tool=memcheck --leak-check=full ./tsess list
And as expected, I get a clean report:
HEAP SUMMARY:
in use at exit: 44 bytes in 2 blocks
total heap usage: 101 allocs, 99 frees, 45,995 bytes allocated
LEAK SUMMARY:
definitely lost: 0 bytes in 0 blocks
indirectly lost: 0 bytes in 0 blocks
possibly lost: 0 bytes in 0 blocks
still reachable: 44 bytes in 2 blocks
suppressed: 0 bytes in 0 blocks
Closing words
The key takeaways in my opinion are first that the only mistakes I made were the same: I forgot to immediately free resources with the defer keyword just after allocating them. I need to grow this reflex of planning a deferred free on the very next statement following an allocation. Even though in one case I did, but did not notice its absence during a review. Second, Hare leaves me to my own devices when it comes to resource management, but unlike C it offers a precious defer keyword that deals with multipath pitfalls. Third, as proven with Valgrind, I have some tricks up my sleeve to mitigate the inevitable mistakes I make. Fourth, the Hare model appears to be a successful recipe, as I was able to limit actual leaks to two spots for approximately 3000 lines of Hare code. And my last key takeaway is that despite learning Hare, I don’t remember ever wondering what I was doing with memory, unlike my experience with Go.
I need to mention that static linking has a benefit that dynamic linking lacks out of the box. At some point I tried to both defer a free and manually do it. Hare’s allocator warned me about a potential double free, and was of course spot on.
My actual initial verdict on Hare safety is very positive.
Or is it?
Fast forwarding to version 0.4
A couple of releases later, tsess is still lacking a test suite. Thus, there has been no attempt at systematizing safety checks. As a result, another leak was introduced, unsurprisingly.
This is not a problem for several reasons, ranked by importance:
- tsess is designed for short-lived executions, no lingering leaks
- once the testing strategy is in place, it will include safety checks
- tsess has not reached 1.0, whether in scope or quality
This time the leak was not a missing defer statement, so I can maybe claim that this reflex is settling in. It is probably too soon to tell anyway.
I also extended my safety checks to more than listing configurations and found new worrying complaints from Valgrind.
Not having time to focus on all of them I narrowed the easiest one down:
$ cat unsafe.ha
use fmt;
use unix;
export fn main() void = {
let unsafe: (rune | void) = '\n';
if (unix::getuid() != 0)
unsafe = void;
if (unsafe == '\n')
fmt::println("ERROR")!;
};
$ hare build -qR -lc -o unsafe unsafe.ha
$ valgrind --tool=memcheck --leak-check=full ./unsafe
==601071== Memcheck, a memory error detector
==601071== Copyright (C) 2002-2024, and GNU GPL'd, by Julian Seward et al.
==601071== Using Valgrind-3.23.0 and LibVEX; rerun with -h for copyright info
==601071== Command: ./unsafe
==601071==
==601071== Conditional jump or move depends on uninitialised value(s)
==601071== at 0x44628B: .main (unsafe.ha:8)
==601071== by 0x408738: main (start+libc.ha:20)
==601071==
==601071==
==601071== HEAP SUMMARY:
==601071== in use at exit: 0 bytes in 0 blocks
==601071== total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==601071==
==601071== All heap blocks were freed -- no leaks are possible
==601071==
==601071== Use --track-origins=yes to see where uninitialised values come from
==601071== For lists of detected and suppressed errors, rerun with: -s
==601071== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
One thing that surprised me when I looked at my code was that it made a direct comparison between a tagged union type (rune | void) and a rune instead of using a match expression to decompose it. I read the Hare specification but it took me months and I do not remember the rules allowing expressions involving a union with one of its types without a match, a cast or the as keyword. I don’t even remember from reading the specification that it was possible in the first place, so I probably forgot that I was dealing with a tagged union when I did that.
The good news is that this is probably a false positive, at least as far as Hare is concerned. (Future me says no.) My assumption is that the offending if expression reads more than the type tag for its comparison, but at least it does not print an error so we get the expected behavior. (Future me says no.) Valgrind will stop complaining if the second if expression is turned into a match so there is still something amiss. (Future me shouts YES!)
The first if expression checks the UID to make sure the second assignment is not optimized away.
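For reference, here is a sketch of what the match variant mentioned above could look like. It is the same program with only the second if replaced. The file name and the exact layout of the arms are illustrative, and it is the match form that is reported above to keep Valgrind quiet:

```hare
// safe.ha (hypothetical name): unsafe.ha with a match instead of the
// direct (rune | void) == rune comparison.
use fmt;
use unix;

export fn main() void = {
	let unsafe: (rune | void) = '\n';
	// Same trick as before: a run-time condition keeps the assignment alive.
	if (unix::getuid() != 0)
		unsafe = void;
	// The tag is checked first, and the rune is only read in the rune arm.
	match (unsafe) {
	case let r: rune =>
		if (r == '\n')
			fmt::println("ERROR")!;
	case void =>
		void; // nothing to compare, the value carries no rune
	};
};
```

The difference with the original is that the rune payload is never looked at unless the tag says there is one.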
How do tagged union types actually work?
I explored several aspects of Hare as I was learning it, like for example how the moving parts of the build system are articulated. From my recollection of the Hare specification, the layout of tagged union types is an implementation detail, but how does the reference implementation manage them?
I tried to look at the generated assembly code to figure this out:
$ rm -rf ~/.cache/hare/tmp/
$ hare build -vvR -lc -o unsafe unsafe.ha 2>&1 | grep 'unsafe\.ha/.*\.s$'
as -o /home/dridi/.cache/hare/tmp/tmp.qYgG01hZNn/unsafe.ha/896798450034f4518c1b81f1230065157390a134955ec9fc76f91d699574c45b.o.tmp /home/dridi/.cache/hare/tmp/tmp.qYgG01hZNn/unsafe.ha/c3dd2ba88647977896af5bed73e7cbfb2ab8560fc3652a21996b94efcd048d45.s
Since this is the assembly code generated by qbe, it uses the AT&T notation I’m not familiar with. This is hopefully trivial enough for me… I mean, it can hardly be more trivial, can it? I digress.
And with that, let’s look at the beginning of the main() function:
.main:
pushq %rbp
movq %rsp, %rbp
subq $112, %rsp
.loc 2 4 25
.loc 2 5 12
movl $1737287038, -56(%rbp)
.loc 2 5 40
movl $10, -52(%rbp)
The second movl is for the '\n' assignment, a rune is a 32-bit integer. My logical conclusion is that 1737287038 is the tag for the rune type either in Hare in general, in this program, or for this specific tagged union type (rune | void). After all, they both refer to line 5 where the variable is declared and first assigned.
If there is indeed an access to uninitialized memory, does it mean that the “value” for void is stored separately? Let’s skip ahead to line 7:
.loc 2 7 30
movl $3650376889, -112(%rbp)
.loc 2 7 30
movq -112(%rbp), %rax
movq %rax, -56(%rbp)
First, we write the new tag 3650376889 for the void type in some scratch space on the stack. Then we copy 64 bits to RAX and finally we copy the tag and the void “value” where the unsafe variable is stored.
If we check the beginning of the main() function again, and the code I’m omitting, we can confirm that we copied 4 initialized bytes for the tag along with 4 uninitialized bytes in RAX, and then smuggled them in unsafe.
So I can make a few observations. As usual, Valgrind was correctly complaining about uninitialized memory and I didn’t expect to prove it wrong. The layout of a tagged union type appears to be a 32bit tag followed by what looks like a C union. This was at least what I remembered from the presentation linked in this blog post (but I don’t have time for a rewatch).
I think that what’s happening is that a tag is set in the scratch space and that the void type has a zero size. So we end up with an uninitialized read because the assignment will copy the total union size. And finally, because the scratch space appears to be reserved at the beginning of the stack for the main() function, one tagged assignment could initialize this space and turn this into a false negative from Valgrind.
Another obvious consequence is that the rune check could accidentally match a 32-bit uninitialized value already present on the stack.
Checking the spurious match hypothesis
Let’s tweak the program to see whether we can display “ERROR” in the standard output:
$ cat spurious.ha
use fmt;
use unix;
export fn main() void = {
let unsafe: (rune | void) = '\0';
if (unix::getuid() != 0)
unsafe = void;
if (unsafe == '\0')
fmt::println("ERROR")!;
};
$ hare build -qR -o spurious spurious.ha
$ ./spurious
ERROR
Hypothesis confirmed.
Checking the false negative hypothesis
Let’s tweak the program to see whether we can silence Valgrind without solving the actual problem:
$ cat smuggle.ha
use fmt;
use unix;
export fn main() void = {
let unsafe: (rune | void) = '\0';
let unrelated: (rune | void) = void;
if (unix::getpid() != 0)
unrelated = 'u';
if (unix::getuid() != 0)
unsafe = void;
if (unsafe == 'u')
fmt::println("ERROR")!;
};
$ hare build -qR -lc -o smuggle smuggle.ha
$ valgrind --tool=memcheck --leak-check=full ./smuggle
==606695== Memcheck, a memory error detector
==606695== Copyright (C) 2002-2024, and GNU GPL'd, by Julian Seward et al.
==606695== Using Valgrind-3.23.0 and LibVEX; rerun with -h for copyright info
==606695== Command: ./smuggle
==606695==
ERROR
==606695==
==606695== HEAP SUMMARY:
==606695== in use at exit: 0 bytes in 0 blocks
==606695== total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==606695==
==606695== All heap blocks were freed -- no leaks are possible
==606695==
==606695== For lists of detected and suppressed errors, rerun with: -s
==606695== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Hypothesis confirmed.
Not only did the unrelated variable assignment hide the uninitialized access, it also managed to smuggle the 'u' character in the unsafe variable, even tagged as void. It was smuggled in the sense that the offending if expression doesn’t seem to check the tag and takes this rune at face value.
Initial verdict on Hare safety
My initial verdict on Hare safety was very positive, and I can’t retroactively change that.
Did my new findings change my opinion on Hare safety? Not really. This finding was really interesting to dive into, being both really small and really trivial (emphasis on really).
I think that the (void | rune) == rune comparison should be a compile error, and I should probably submit a bug report. Valgrind is complaining about other things that are a little more worrying, but I already burnt a lot of my spare time looking into this one and writing about it.
I’m not too fond of the shared scratch space for tagged (and possibly other) assignments, but there are probably constraints I’m not aware of that lead to this design. I’m also not really familiar with how other compilers deal with their stack.
It should be noted that I found this with Hare 0.24.0, and even though the project is already quite advanced, it’s probably fair to say that Hare is still somewhat in its infancy. And I don’t expect 1.0 to somehow turn the reference implementation into perfect software. Software implies bugs and that’s a fact of life. Hare still offers me an interesting alternative to C without the complications of Rust (that I see more as a C++ alternative).
While Rust’s compiler has to behave like a static analyzer out of the box, I also believe that Hare generally leaves far fewer opportunities to lose track of the flow of a program. Writing a static analyzer for Hare would therefore likely allow more accurate results than what state-of-the-art static analyzers for C can manage, based on my unscientific wet-finger estimate, but I digress…
My verdict is still very positive.