akaros.git
2 years agovmrunkernel: allow -M for setting memory start
Ronald G. Minnich [Mon, 14 Nov 2016 23:18:12 +0000 (15:18 -0800)]
vmrunkernel: allow -M for setting memory start

And, as part of finding compiler warnings to make
this work, do some cleanup.

Oh, and as part of seeing that the help message
was woefully wrong, fix that too by having it
print the contents of the options struct,
not a string that will keep going wrong :-)

Change-Id: I98b25095ff2f1255afbf1257d56197b1f6bc8d08
Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
[formatting nits]
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoAdding documentation for using adt and gerrit to Contributing.md
Gan Shun [Wed, 9 Nov 2016 22:36:17 +0000 (14:36 -0800)]
Adding documentation for using adt and gerrit to Contributing.md

Signed-off-by: Gan Shun <ganshun@gmail.com>
Change-Id: I4b58d08e88d570c1d237f7cb14ef79fd21654940
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agovmrunkernel: remove statically allocated _kernel[]
Ronald G. Minnich [Thu, 3 Nov 2016 20:05:49 +0000 (13:05 -0700)]
vmrunkernel: remove statically allocated _kernel[]

Kernel memory is now dynamically allocated.
It always starts at 16 MiB, a good choice for Linux.
It defaults to 1GiB but you can change the size
via -m.

The startup code makes sure that __procinfo.program_end
is < 16 MiB, and that 16 MiB + memsize does not intrude into
BRK_START.
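
The two startup checks above amount to a range test. Here is a minimal
sketch of that arithmetic in C; the helper name and its parameters are
hypothetical and this is not the vmrunkernel code, only an illustration
of the bounds described in the message.

```c
#include <stdbool.h>
#include <stdint.h>

#define MiB (1024ULL * 1024ULL)

/* Hypothetical helper: returns true if a guest region of memsize bytes
 * starting at memstart begins above the end of the program image and
 * ends at or below brk_start.  All bounds are supplied by the caller. */
static bool guest_region_fits(uint64_t program_end, uint64_t memstart,
                              uint64_t memsize, uint64_t brk_start)
{
	if (program_end >= memstart)
		return false;
	if (memstart > brk_start)
		return false;
	/* Written as a subtraction so memstart + memsize cannot overflow. */
	return memsize <= brk_start - memstart;
}
```

With memstart at 16 * MiB, the first test corresponds to "program_end is
< 16 MiB" and the last to "16 MiB + memsize does not intrude into
BRK_START".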

We also don't use MAP_FIXED. Rather, we test after
the mmap that we got the address we want. This
ensures that we got our mapping and that we did
not get it at the expense of unmapping something else.
It's a more conservative test than using MAP_FIXED
and testing for MAP_FAILED.
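
A sketch of the "hint, then verify" approach described above, using the
standard mmap() call. The function name map_exactly_at() is hypothetical,
and the protection and flag choices are assumptions for illustration, not
taken from vmrunkernel.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: ask for len bytes at exactly 'want' without
 * MAP_FIXED.  The address is only a hint, so nothing already mapped
 * there is replaced; a miss is detected by comparing the result. */
static void *map_exactly_at(void *want, size_t len)
{
	void *got;

	assert(len > 0);
	got = mmap(want, len, PROT_READ | PROT_WRITE,
	           MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (got == MAP_FAILED)
		return NULL;
	if (got != want) {
		/* Mapped somewhere else: give it back and report failure. */
		munmap(got, len);
		return NULL;
	}
	return got;
}
```

If the requested range is already occupied, the call returns NULL and the
existing mapping is left untouched, which is the property MAP_FIXED would
not give.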

Tested by booting a Linux kernel.

Change-Id: I6dc2c8e729f27c143e38f53a229e84ab145fb051
Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoAdded ADT script
Gan Shun [Wed, 2 Nov 2016 17:35:23 +0000 (10:35 -0700)]
Added ADT script

This allows us to easily push to gerrit and set up custom reviewers and
topics. The topic defaults to the local branch name unless otherwise
specified.

Signed-off-by: Gan Shun <ganshun@gmail.com>
Change-Id: I841ed157ef6d663d718368652654b0b6039bdc7a
[removed blank at EOF]
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agovmrunkernel: load the file using the ELF library
Ronald G. Minnich [Tue, 1 Nov 2016 16:41:38 +0000 (09:41 -0700)]
vmrunkernel: load the file using the ELF library

This has been used to boot a full Linux kernel environment
to multiuser.

Change-Id: I9ba0ef062f05994225358e92a24de2d7934c8cd9
Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoAdd a script to help review gerrit patch sets
Barret Rhoden [Tue, 1 Nov 2016 17:36:56 +0000 (13:36 -0400)]
Add a script to help review gerrit patch sets

Like git track-review, this grabs a branch for a gerrit change (with git
gerrit-track), extracts it into patches, and runs checkpatch.

It could just as easily call git checkpatch, but breaking it into .patches
helps a little.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoReorder the top level Makefile so that full builds work again
Ronald G. Minnich [Mon, 31 Oct 2016 22:56:28 +0000 (15:56 -0700)]
Reorder the top level Makefile so that full builds work again

Otherwise, they fail, as gelf.h is not installed when
make tests runs.

Change-Id: If19d8515706a7a43ccd37bf2e60fbf88ce4cd581
Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoDocfix: changed obj/kernel to obj/kern
Fergus Simpson [Tue, 1 Nov 2016 00:43:30 +0000 (17:43 -0700)]
Docfix: changed obj/kernel to obj/kern

There is no kernel folder in obj, just kern.

Change-Id: Id3f901fc0c347cb5e0c5fa220ce83f7338199770
Signed-off-by: Fergus Simpson <afergs@google.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoAdd script to track a particular gerrit change
Gan Shun [Fri, 28 Oct 2016 19:30:27 +0000 (12:30 -0700)]
Add script to track a particular gerrit change

This script pulls the latest patch-set from gerrit for a particular
change and creates a branch.

Signed-off-by: Gan Shun <ganshun@gmail.com>
Change-Id: I56120268935ecca38b978a8e519d5cab6430e70f

2 years agoMove the BRK_START to a fixed, safe address (XCC)
Barret Rhoden [Wed, 26 Oct 2016 19:19:07 +0000 (15:19 -0400)]
Move the BRK_START to a fixed, safe address (XCC)

The VM code often wants to mmap blobs at various fixed addresses, such
as the guest kernel.  Our old glibc heap would start right at the top of
the program's loading point, which meant that we couldn't safely use any
of that memory.  The current vmrunkernel just has a huge array that
covers the memory regions it expects to use.  This is less than ideal.

This commit just specifies a region of the process's virtual address
space that glibc will use for its sbrk() allocations (e.g. malloc()).
Any program can safely mmap with MAP_FIXED below this address (up to the
binary's end point, which the kernel reports in procinfo->program_end).

Here's a before and after.  Note the old 0x21000 bytes has moved from
0x647000 to its new location at 0x100000000000.

bash-4.3$ cat /proc/self/maps
00100000-00120000 rwxp 00000000 01:00 146 /lib/ld-2.19.so
00320000-00321000 r--p 00020000 01:00 146 /lib/ld-2.19.so
00321000-00322000 rw-p 00021000 01:00 146 /lib/ld-2.19.so
00322000-00323000 rw-p 00000000 00:00 0 [heap]
00400000-00443000 r-x- 00000000 01:00 102 /bin/busybox
00443000-00444000 r-xp 00043000 01:00 102 /bin/busybox
00643000-00644000 rw-p 00043000 01:00 102 /bin/busybox
00644000-00647000 rw-p 00000000 00:00 0 [heap]
00647000-00668000 rwx- 00000000 00:00 0 [heap]
400000000000-400000001000 rw-p 00000000 01:00 146 /lib/ld-2.19.so
400000001000-400000002000 rw-p 00000000 00:00 0 [heap]
400000002000-400000141000 r-xp 00000000 01:00 182 /lib/libc-2.19.so
400000141000-400000341000 ---p 0013f000 01:00 182 /lib/libc-2.19.so
400000341000-400000345000 r--p 0013f000 01:00 182 /lib/libc-2.19.so
400000345000-400000347000 rw-p 00143000 01:00 182 /lib/libc-2.19.so
400000347000-40000034a000 rw-p 00000000 00:00 0 [heap]
40000034a000-40000034b000 rw-p 00000000 00:00 0 [heap]
40000034b000-40000034f000 rw-- 00000000 00:00 0 [heap]
40000034f000-400000351000 rwx- 00000000 00:00 0 [heap]
400000351000-400000353000 rw-- 00000000 00:00 0 [heap]
7f7fff8ff000-7f7fff9ff000 rw-- 00000000 00:00 0 [heap]

bash-4.3$ cat /proc/self/maps
00100000-00120000 rwxp 00000000 01:00 146 /lib/ld-2.19.so
00320000-00321000 r--p 00020000 01:00 146 /lib/ld-2.19.so
00321000-00322000 rw-p 00021000 01:00 146 /lib/ld-2.19.so
00322000-00323000 rw-p 00000000 00:00 0 [heap]
00400000-00443000 r-x- 00000000 01:00 102 /bin/busybox
00443000-00444000 r-xp 00043000 01:00 102 /bin/busybox
00643000-00644000 rw-p 00043000 01:00 102 /bin/busybox
00644000-00647000 rw-p 00000000 00:00 0 [heap]
100000000000-100000021000 rwx- 00000000 00:00 0 [heap]
400000000000-400000001000 rw-p 00000000 01:00 146 /lib/ld-2.19.so
400000001000-400000002000 rw-p 00000000 00:00 0 [heap]
400000002000-400000141000 r-xp 00000000 01:00 182 /lib/libc-2.19.so
400000141000-400000341000 ---p 0013f000 01:00 182 /lib/libc-2.19.so
400000341000-400000345000 r--p 0013f000 01:00 182 /lib/libc-2.19.so
400000345000-400000347000 rw-p 00143000 01:00 182 /lib/libc-2.19.so
400000347000-40000034a000 rw-p 00000000 00:00 0 [heap]
40000034a000-40000034b000 rw-p 00000000 00:00 0 [heap]
40000034b000-40000034f000 rw-- 00000000 00:00 0 [heap]
40000034f000-400000351000 rwx- 00000000 00:00 0 [heap]
400000351000-400000353000 rw-- 00000000 00:00 0 [heap]
7f7fff8ff000-7f7fff9ff000 rw-- 00000000 00:00 0 [heap]

Rebuild glibc.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoRemove proc->heap_top
Barret Rhoden [Wed, 26 Oct 2016 19:40:40 +0000 (15:40 -0400)]
Remove proc->heap_top

Unused.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoRemove init=/bin/sh from vmimage_cmdine
Gan Shun [Wed, 26 Oct 2016 18:08:58 +0000 (11:08 -0700)]
Remove init=/bin/sh from vmimage_cmdine

We no longer need that to boot all the time, so I'm removing it from the
defaults.

Signed-off-by: Gan Shun <ganshun@gmail.com>
Change-Id: I18e7023475a8de4abf0588dbc7c298ccd6632e89
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoDelete unsupported entries for userspace MSR handling.
Gan Shun [Wed, 26 Oct 2016 18:08:57 +0000 (11:08 -0700)]
Delete unsupported entries for userspace MSR handling.

We can't handle most of these emulated MSRs because we don't actually read
and write the MSR in userspace. Remove them from the emmsr array.

Signed-off-by: Gan Shun <ganshun@gmail.com>
Change-Id: I127adf7ef346df7a5aeb3959b4b41afc25921c49
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoFix IA32_MISCENABLE disabling of PEBS
Gan Shun [Wed, 26 Oct 2016 18:08:56 +0000 (11:08 -0700)]
Fix IA32_MISCENABLE disabling of PEBS

We weren't correctly checking the written value. We tell the guest that
PEBS is disabled, thus when they write the same value back to the MSR, we
should check for the disable bit in miscenable.
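
One way to read the check described above, as a hedged sketch. Bit 12 of
IA32_MISC_ENABLE is the architectural "PEBS unavailable" bit; the two
function names are hypothetical and the accept/reject policy is an
assumption about the intent, not the actual emulation code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 12 of IA32_MISC_ENABLE: "PEBS unavailable". */
#define MISC_ENABLE_PEBS_UNAVAIL (1ULL << 12)

/* Hypothetical read hook: the value reported to the guest always has
 * the PEBS-unavailable bit set. */
static uint64_t emulated_misc_enable_read(uint64_t host_val)
{
	return host_val | MISC_ENABLE_PEBS_UNAVAIL;
}

/* Hypothetical write check: accept a guest write only if it leaves PEBS
 * marked unavailable, which includes writing back exactly what was read. */
static bool emulated_misc_enable_write_ok(uint64_t guest_val)
{
	return (guest_val & MISC_ENABLE_PEBS_UNAVAIL) != 0;
}
```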

Signed-off-by: Gan Shun <ganshun@gmail.com>
Change-Id: I0e00119d7fec678e2c4e3b2185565444022ac140
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoAHCI: Prevent sign extension of partial address
Fergus Simpson [Thu, 20 Oct 2016 19:00:55 +0000 (12:00 -0700)]
AHCI: Prevent sign extension of partial address

Drive reads were not working past the 1 TiB mark because the resulting
address was negative. This was determined to be an issue with an
unsigned char getting sign extended when bit shifted into an int64_t.
It is now cast to a uint32_t after the shift to prevent sign extension.
The container was also changed from int64_t to uint64_t.
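
The effect can be reproduced outside the driver. The sketch below uses
hypothetical function names and models the old behaviour with an explicit
int32_t conversion, since shifting a promoted int into its sign bit is not
something portable C defines. Note that it casts before the shift, which
is a choice made for this example; the commit describes casting after.

```c
#include <stdint.h>

/* Models the old behaviour: the byte ends up in a signed 32-bit value,
 * so a set top bit is sign extended when widened to 64 bits.  (The
 * int32_t conversion is implementation-defined; this shows the usual
 * two's-complement result.) */
static int64_t lba_byte3_sign_extended(unsigned char b)
{
	return (int64_t)(int32_t)((uint32_t)b << 24);
}

/* Fixed: keep the shifted value unsigned and store it in a uint64_t. */
static uint64_t lba_byte3_fixed(unsigned char b)
{
	return (uint64_t)((uint32_t)b << 24);
}
```

Assuming 512-byte sectors, sector number 0x80000000 is exactly the 1 TiB
mark, which matches where reads started to fail: that is the first value
whose byte 3 has its top bit set.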

Change-Id: I590b0da4fd0c02b0e2542a0b65bde510bba89525
Signed-off-by: Fergus Simpson <afergs@google.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoAHCI: Skip device permission check
Fergus Simpson [Tue, 18 Oct 2016 00:11:26 +0000 (17:11 -0700)]
AHCI: Skip device permission check

Permissions are not currently implemented in Akaros. This change simply
makes it so that devpermcheck(...) always returns before throwing an
error that would result in permission being denied. This allows the
device to actually be used even though permissions have not been
implemented.

Change-Id: Ic2f19071803bba497d916031a22bcbc0b70e8ffd
Signed-off-by: Fergus Simpson <afergs@google.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoAHCI: Add C600 HBA and fix PCI iteration bugs
Fergus Simpson [Tue, 18 Oct 2016 00:09:30 +0000 (17:09 -0700)]
AHCI: Add C600 HBA and fix PCI iteration bugs

This commit adds the C600 HBA to the list of recognized Intel HBAs so my
machine with one can use it, and makes two fixes to the PCI driver.

The first fix concerns function iteration. When detecting PCI devices, the
driver iterates over all functions on all devices. A device can have up to 8
functions (0-7) and the driver assumes they are sequential, giving up
when one is not found. This should not be done. A device is detected by
whether function 0 is implemented; if it is not, no device is connected.
While a device must implement function 0, it does not need to implement
its other functions sequentially. The C600 for example implements 0, 2,
3, so the driver did not detect functions 2 and 3 and the HBA did not
work. The driver has been changed so that it will only give up if
function 0 is not found.
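
The corrected iteration rule can be sketched as follows. The probe
callback type and all names here are hypothetical stand-ins for the
driver's config-space reads; only the loop structure reflects the
description above.

```c
#include <stdbool.h>

#define PCI_MAX_FUNCS 8

/* Hypothetical probe callback: true if (bus, dev, func) responds. */
typedef bool (*pci_probe_fn)(int bus, int dev, int func);

/* Counts the functions present on one device.  A missing function 0
 * means no device; holes among functions 1-7 are skipped rather than
 * treated as the end of the device. */
static int pci_count_funcs(pci_probe_fn probe, int bus, int dev)
{
	int found = 0;

	for (int func = 0; func < PCI_MAX_FUNCS; func++) {
		if (!probe(bus, dev, func)) {
			if (func == 0)
				break;		/* nothing at this slot */
			continue;		/* sparse numbering is legal */
		}
		found++;
	}
	return found;
}

/* Example: device 0 implements functions 0, 2 and 3, like the C600
 * case in the message; every other device is absent. */
static bool sparse_probe(int bus, int dev, int func)
{
	(void)bus;
	return dev == 0 && (func == 0 || func == 2 || func == 3);
}
```

With the old "stop at the first missing function" rule, the example
device would report one function instead of three.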

Another issue was fixed with the PCI driver where it would not detect
devices on bus 0xff - the last bus. There was a comment about issues
with bus 0xff but that doesn't seem to be an issue any more so the
driver will now check the last bus.

Change-Id: I8dcac3f27b4983a9141e5700d73a758389cef75a
Signed-off-by: Fergus Simpson <afergs@google.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoAHCI: Replace MMIO accesses with helper functions
Fergus Simpson [Mon, 17 Oct 2016 22:08:51 +0000 (15:08 -0700)]
AHCI: Replace MMIO accesses with helper functions

This commit removes all pointer accesses to MMIO by removing the
structs that represented MMIO. They have been replaced by helper
functions that use volatile accesses to make sure that the reads
and writes always happen. Instead of structs, blocks of memory are
simply used that are indexed into using constants that represent each
register. All virtual addresses are represented by void pointers, and
all physical addresses would be represented by uintptr_t types; however,
all physical addresses are stored in MMIO and hence are only accessed
with the provided helper functions. They are only written to as
addresses needed by the HBA. The host keeps its own references to the
structs as void pointers to virtual memory.
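
A minimal sketch of the helper style described above: a block of memory
indexed by register-offset constants and accessed through volatile
pointers. The helper names and the two offsets are hypothetical and are
not the driver's real register map.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register offsets, in bytes from the start of the block. */
#define REG_CTL		0x00
#define REG_STATUS	0x04

/* The volatile access keeps the compiler from eliding or merging the
 * load, so every call performs a real read. */
static inline uint32_t mmio_read32(void *base, size_t off)
{
	return *(volatile uint32_t *)((uint8_t *)base + off);
}

/* Likewise, every call performs a real store. */
static inline void mmio_write32(void *base, size_t off, uint32_t val)
{
	*(volatile uint32_t *)((uint8_t *)base + off) = val;
}
```

Callers then write mmio_write32(regs, REG_CTL, val) instead of assigning
through a struct pointer.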

Change-Id: Ia62cd57797ca8db9f21f47559c524149ad6fc11e
Signed-off-by: Fergus Simpson <afergs@google.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoAHCI: Fix hardware address gets in driver
Fergus Simpson [Mon, 17 Oct 2016 21:55:51 +0000 (14:55 -0700)]
AHCI: Fix hardware address gets in driver

The AHCI driver was using PCIWADDR(ptr) to get the physical address of
memory mapped structs, but only assigning it to the lower 32 bits of
any address field and setting the upper 32 bits to 0. AHCI's memory
mapped structs use 32-bit registers, so the two halves are stored in
sequential registers.

This fix uses paddr_low32(ptr) and paddr_high32(ptr) to get both halves
of the address.

This should fix issues that occur when a memory mapped struct is outside
of the 32-bit address space.
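
To illustrate the split, here is a small sketch with hypothetical
stand-ins for the two helpers and a plain struct standing in for a pair
of sequential 32-bit registers. The helper bodies are assumptions; the
message only names the real functions.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the low/high helpers named above. */
static inline uint32_t addr_low32(uint64_t paddr)
{
	return (uint32_t)(paddr & 0xffffffffULL);
}

static inline uint32_t addr_high32(uint64_t paddr)
{
	return (uint32_t)(paddr >> 32);
}

/* Two sequential 32-bit registers that together hold one address. */
struct addr_regs {
	uint32_t lo;
	uint32_t hi;
};

static void set_addr(struct addr_regs *r, uint64_t paddr)
{
	r->lo = addr_low32(paddr);
	r->hi = addr_high32(paddr);	/* the old code wrote 0 here */
}
```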

Change-Id: I8e5ef62c580cc002510ccabadef9c2fcf0153bc8
Signed-off-by: Fergus Simpson <afergs@google.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoAHCI: Remove struct typedefs from driver
Fergus Simpson [Mon, 17 Oct 2016 21:54:12 +0000 (14:54 -0700)]
AHCI: Remove struct typedefs from driver

This makes the driver more consistent with the rest of Akaros's code.

Change-Id: I427e439ee1b34a2bcf5ec86c94e4f590c0681ee5
Signed-off-by: Fergus Simpson <afergs@google.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoAHCI: get it to build and almost work.
Ronald G. Minnich [Mon, 17 Oct 2016 21:36:44 +0000 (14:36 -0700)]
AHCI: get it to build and almost work.

In qemu, it still shows the device as having zero bytes.
But it does find it.

Change-Id: I81262b460a9cd43a848c1d782c109ec216afb795
Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
Signed-off-by: Fergus Simpson <afergs@google.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agomlx4: /dev/ -> /dev_vfs/
Barret Rhoden [Tue, 18 Oct 2016 18:21:57 +0000 (14:21 -0400)]
mlx4: /dev/ -> /dev_vfs/

Fixes the mlx4 driver in accordance with commit 9724f9a56650 ("Move VFS
/dev/ -> /dev_vfs/").

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoConvert the capability device to use SHA1
Ronald G. Minnich [Fri, 14 Oct 2016 22:43:03 +0000 (15:43 -0700)]
Convert the capability device to use SHA1

This involves a minor code change but I take the opportunity
to clean things up, getting rid of files we don't need,
and fixing includes.

Change-Id: Ie9ead4b6a2473d2f25b7b0a777343aef598f8dd9
Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agocapability device: get it to compile
Ronald G. Minnich [Fri, 14 Oct 2016 20:26:18 +0000 (13:26 -0700)]
capability device: get it to compile

We need to do something about the use of sha1.

Change-Id: I80795609ccea1ac629cb7b9d4a95040cc040d76a
Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
[whitespace]
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agocapability: run scripts/PLAN9 on capability device
Ronald G. Minnich [Fri, 14 Oct 2016 20:26:17 +0000 (13:26 -0700)]
capability: run scripts/PLAN9 on capability device

Change-Id: I55dbed3e636730c4768c61168f22c61c9e2c82fb
Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
[sizeof ()s]
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agocapability: clang-format the capability device
Ronald G. Minnich [Fri, 14 Oct 2016 20:26:16 +0000 (13:26 -0700)]
capability: clang-format the capability device

Change-Id: I3e99b8317fc57fbfb775fd4242e5fb2f36411a46
Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoadd the capability device from Harvey (from Plan 9)
Ronald G. Minnich [Fri, 14 Oct 2016 20:26:15 +0000 (13:26 -0700)]
add the capability device from Harvey (from Plan 9)

Change-Id: If159e72517809eedd0d1e98271e3dde57e035090
Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agocrypto: get sha256 support to build.
Ronald G. Minnich [Thu, 13 Oct 2016 20:39:04 +0000 (13:39 -0700)]
crypto: get sha256 support to build.

For now we'll just go with the sha256.c. That said,
we'll keep the other bits in here. Sooner or later we may
need the other crypto functions. Note these are not compiled
in conditionally.

We should consider removing the conditional compiling
of the unrolled code; we don't have space constraints of firmware.

Change-Id: Ic792cf2b89fa4f01a94c420eb3c620b62c7bf2a9
Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agocrypto: move includes to kern/include
Ronald G. Minnich [Thu, 13 Oct 2016 20:39:03 +0000 (13:39 -0700)]
crypto: move includes to kern/include

Change-Id: Id9e62496bb6595a7f282dfa26bd1fa1cbdac8bb4
Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agocrypto: initial import of the chromeos vboot libraries
Ronald G. Minnich [Thu, 13 Oct 2016 20:39:02 +0000 (13:39 -0700)]
crypto: initial import of the chromeos vboot libraries

This code is needed to support the capability device, imported
in a separate commit. This is recommended as a 'best' version
of these algorithms by a security expert at Google.

This is from  https://chromium.googlesource.com/chromiumos/platform/vboot_reference
ref 3b55afa94e84c91874fcdad352b4053036886aa7

Change-Id: Ie3d90f183df990fd5bde6dfd83efbbd1e9b6009b
Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoifconfig: invoke ipconfig with `-P` when configuring loopback.
Dan Cross [Fri, 7 Oct 2016 20:18:30 +0000 (16:18 -0400)]
ifconfig: invoke ipconfig with `-P` when configuring loopback.

Don't overwrite cached data from the DHCP server.

Change-Id: Ie7d3ad4be5d9cf6aeb4def7d8c47ffefe522c80d
Signed-off-by: Dan Cross <crossd@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoFix a minor bug in `ipfconfig` and clean up some logic.
Dan Cross [Fri, 7 Oct 2016 20:13:20 +0000 (16:13 -0400)]
Fix a minor bug in `ipfconfig` and clean up some logic.

If the lease time is 1, then we wouldn't wait; that's a bug.
Clean up an obnoxious conditional.

Change-Id: I25ad3c5ac3510d56a0dc3d37b464ca002236875b
Signed-off-by: Dan Cross <crossd@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoRemove our glibc poll implementation (XCC)
Barret Rhoden [Fri, 1 Jul 2016 20:25:29 +0000 (16:25 -0400)]
Remove our glibc poll implementation (XCC)

The old one just immediately returned.  Now that we have a version of
poll() in iplib, that one would have overridden glibc's.  However, if we
messed up and didn't link with iplib, then we'd silently be using the old
broken glibc version again.  This way, we'll catch it with a stub warning.

Rebuild glibc.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoLink busybox with iplib
Barret Rhoden [Thu, 7 Jul 2016 16:37:57 +0000 (12:37 -0400)]
Link busybox with iplib

It needs it to use our poll, instead of glibc's - which will soon be a
stub.

Rebuild busybox.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoAdd nonblocking reads and FD taps to #cons/stdin
Barret Rhoden [Wed, 5 Oct 2016 21:02:04 +0000 (17:02 -0400)]
Add nonblocking reads and FD taps to #cons/stdin

This allows select/poll/epoll of stdin, which a few apps want to do.  The
change to consstat is so that select() can detect if the console is
readable.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoAdd a devstat helper
Barret Rhoden [Thu, 6 Oct 2016 18:44:37 +0000 (14:44 -0400)]
Add a devstat helper

Devices can use this if they want to build their own stat functions,
given a dir.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoMove VFS /dev/ -> /dev_vfs/
Barret Rhoden [Wed, 5 Oct 2016 20:22:41 +0000 (16:22 -0400)]
Move VFS /dev/ -> /dev_vfs/

Now that stdin/out/err are not in the VFS, we can move the VFS device
directory and get rid of that nasty sys_open() hack.

Code that uses devices, such as the mlx4 user-driver, needs to look in
dev_vfs now.  mlx4 and blockdev still use devfs.c.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoMove stdin/stdout/stderr to #cons
Barret Rhoden [Wed, 5 Oct 2016 19:47:24 +0000 (15:47 -0400)]
Move stdin/stdout/stderr to #cons

It's the same logic, just accessible via 9ns instead of VFS.  #cons/null
already existed.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoRemove the old console input code; use qio
Barret Rhoden [Wed, 5 Oct 2016 16:16:41 +0000 (12:16 -0400)]
Remove the old console input code; use qio

This removes all of console.{c,h}, replacing its functionality with a
basic qio queue in devcons.

Other than using qio instead of the homebrewed rings and sems, this
also uses qiwrite directly from interrupt context.  This avoids an
excessive kernel message.

There were also a couple monitor-related commands sitting around in
console.{c,h}, which I moved to monitor files.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoFix a few debugging tools
Barret Rhoden [Tue, 4 Oct 2016 19:35:18 +0000 (15:35 -0400)]
Fix a few debugging tools

These are minor changes that helped with debugging.  The asserts in qio are
an attempt to debug a panic I got only once.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoqio: Only fire writable taps on edge transitions
Barret Rhoden [Thu, 6 Oct 2016 15:56:58 +0000 (11:56 -0400)]
qio: Only fire writable taps on edge transitions

We were firing writable taps (via the qwake_cb()) any time someone read
from a queue.  The effect of this was that applications that tapped their
Qdata FD would see a lot of writable taps firing, even though there wasn't
an edge transition.

For instance, say a conversation's write queue (outbound, TX) is nowhere
near full.  The app puts a packet in the queue.  When the network stack
drains the block from the queue with __qbread(), that will trigger a
writable tap.  So the app gets an FD tap / epoll every time it writes a
packet.  Incidentally, that behavior helped me track down a bug, but it
isn't what we're looking for.

Like the read side, we only fire on edge transitions, as done in commit
dbaaf4a3029e ("qio: Fire read taps on actual edges").  Back then, I had us
firing writable taps all the time, which was a bit much.

Note that we still fire the readable/writable taps regardless of
Qstarve/Qflow.  Those queue state flags only get set when someone tries to
read/write a queue and fails.  The taps we fire occur independently, which
is why their logic (e.g. was_empty / was_unwritable) is separate from the
rendez control variables (e.g. dowakeup).  This is probably right, since
it's possible for an application to know a queue would block without trying
(perhaps through stat()).
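
The edge-transition rule can be shown with a toy bounded queue. Everything
in this sketch is hypothetical (the struct, the byte-count model, the
counter standing in for the tap callback); it borrows only the
was_unwritable idea from the description above and is not the qio code.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_queue {
	size_t len;		/* bytes currently queued */
	size_t limit;		/* writable while len < limit */
	int writable_taps;	/* times the writable tap fired */
};

static bool toy_writable(const struct toy_queue *q)
{
	return q->len < q->limit;
}

static void toy_write(struct toy_queue *q, size_t n)
{
	q->len += n;
}

/* Drain up to n bytes; fire the writable tap only when the queue goes
 * from unwritable to writable, not on every read. */
static void toy_read(struct toy_queue *q, size_t n)
{
	bool was_unwritable = !toy_writable(q);

	q->len -= (n < q->len) ? n : q->len;
	if (was_unwritable && toy_writable(q))
		q->writable_taps++;
}
```

A write followed by a read on a mostly empty queue fires nothing; only a
read that relieves a full queue does.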

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoqio: Add a check to pullupblock
Barret Rhoden [Tue, 4 Oct 2016 19:32:58 +0000 (15:32 -0400)]
qio: Add a check to pullupblock

This delays the impending doom associated with BLOCK_EXTRA_DATA.  It's
relatively easy to trigger the problem if the block len (and block list
len) is < n.  Just write gibberish into a UDP data FD!

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoClose alarm FDs on fork()
Barret Rhoden [Fri, 30 Sep 2016 20:11:51 +0000 (16:11 -0400)]
Close alarm FDs on fork()

If a parent has alarm FDs, forks, but doesn't exec, then its child will
inherit its alarm FDs.  Other than the child being able to mess with the
parent's alarms, which is bad, the parent is unable to fully be freed (as
in __proc_free()) until the child closes the FD - usually by exiting.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoFix potential overflow error in CEQs (XCC)
Barret Rhoden [Tue, 4 Oct 2016 18:06:06 +0000 (14:06 -0400)]
Fix potential overflow error in CEQs (XCC)

The issue was that a consumer that came in during overflow recovery could
see that there was no overflow and return FALSE, meaning the CEQ was empty,
even though there were older messages.

Consider, the kernel already posted two messages, set overflow, and the
ring is empty:

Thread 1                      Thread 2
--------                      --------
see empty ring                see empty ring
see overflow is on
grab lock
clear overflow
extract a message
                              sees overflow is off
                              returns FALSE
sets overflow
unlocks
returns TRUE

And there's still a message in the CEQ that thread 2 should have grabbed.

While doing this change, I also changed nr_events to an unsigned.  That was
my original intent (based on the usage in epoll), and making the change now
keeps this commit from changing the size of the CEQ, which keeps everyone
from having to rebuild every application.

Reinstall your kernel headers.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoAvoid needless TLB flush when restarting kthreads
Barret Rhoden [Mon, 3 Oct 2016 19:27:10 +0000 (15:27 -0400)]
Avoid needless TLB flush when restarting kthreads

If we're about to run on a core where our address space was already loaded,
we don't need to reload it.  Doing so actually triggers a TLB flush.

Regarding changing this comment:

/* In the future, we could check owning_proc. If it isn't set, we
 * could clear current and transfer the refcnt to kthread->proc. */

Although that is true, it's a bit dangerous and we'd need to measure to
know if it's worth the hassle.  The intent was that if owning_proc !=
current, then this kthread is running 'detached' from its process.  This
could be a syscall that briefly woke up and went back to sleep.  We could
avoid the incref and another decref shortly (when current gets cleared in
smp_idle()->abandon_core()) by transferring the ref.

The issue is that when we clear current, we also need to load a different
page table, since it's possible that the process will be freed before this
core ends up running another page table.  So we could do this optimization,
but then we'd need to load a page table, which is a TLB flush.  Then maybe
we'd be switching back to the process again, since we don't know that the
*next* kthread to run isn't also for this process.  So it's not clear that
avoiding the atomic ops (incref/decref) and moving up the TLB flush was
worth the hassle.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoFix clobber of current in kthread.c
Barret Rhoden [Mon, 3 Oct 2016 19:03:50 +0000 (15:03 -0400)]
Fix clobber of current in kthread.c

Originally, there wasn't a KTH_SAVE_ADDR_SPACE flag.  When I added that, I
didn't update this code.  The resulting bug was that if we had to undo a
kthread swap, that kthread was for a ktask (which doesn't have a proc), and
we had a process's address space loaded, then we'd clobber current
(clearing it).  That would result in a reference counting problem, since we
effectively deleted a counted reference to whatever process was current.
I'd see this on occasion under heavy networking and process load.

This also clears kthread->proc whenever the kthread is not blocked.
Previously, we were leaving the value of the uncounted proc reference.  The
code was okay, but it was surprising when debugging and was a source for
potential bugs.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoDelay clearing owning proc in sys_exec
Barret Rhoden [Fri, 30 Sep 2016 20:30:07 +0000 (16:30 -0400)]
Delay clearing owning proc in sys_exec

If we do it before any of the return calls, we could end up returning to
userspace while owning_proc isn't set.  I think the rest of the kernel is
able to handle this, but there's no sense messing around.  The old comment
makes it sound like we can block in that state too, which is probably true,
but returning by anything other than the error paths seems like a bad
idea.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoChange syscall usec timeouts to unsigned longs
Barret Rhoden [Fri, 30 Sep 2016 20:20:46 +0000 (16:20 -0400)]
Change syscall usec timeouts to unsigned longs

I noticed this due to some sys_block calls having a 'negative' argument in
the printout (due to the %d in the saved string).

While I was here, I also changed halt_core, though note that that timeout
was more of a 'future plan', I think.  The code doesn't use it.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoUse proc_decref() in #proc
Barret Rhoden [Fri, 30 Sep 2016 18:04:27 +0000 (14:04 -0400)]
Use proc_decref() in #proc

Use the helper instead of accessing the kref directly.  That helps with
debugging.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoAdd trace_printf()
Barret Rhoden [Fri, 30 Sep 2016 16:34:57 +0000 (12:34 -0400)]
Add trace_printf()

This is a helper for userspace to print into the kernel's trace_printk()
log.  It's extremely useful for fast print debugging.

The trace_printk() log currently just maintains the last N entries, with
older entries replaced by newer ones.  You can cat it or tail it
repeatedly.  The log file is usually at /prof/kptrace.
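
The "last N entries" behaviour described above is that of a ring buffer.
The following is a toy model with hypothetical names and sizes; it is not
the kernel's trace_printk() implementation.

```c
#include <stdio.h>

#define LOG_ENTRIES	4
#define LOG_ENTRY_LEN	32

struct toy_log {
	char entries[LOG_ENTRIES][LOG_ENTRY_LEN];
	unsigned long next;	/* total entries ever written */
};

/* Store msg in the next slot, overwriting the oldest entry once full. */
static void toy_log_add(struct toy_log *log, const char *msg)
{
	char *slot = log->entries[log->next % LOG_ENTRIES];

	snprintf(slot, LOG_ENTRY_LEN, "%s", msg);
	log->next++;
}

/* Returns the i-th oldest surviving entry, or NULL if out of range. */
static const char *toy_log_get(const struct toy_log *log, unsigned long i)
{
	unsigned long count = log->next < LOG_ENTRIES ? log->next
	                                              : LOG_ENTRIES;

	if (i >= count)
		return NULL;
	return log->entries[(log->next - count + i) % LOG_ENTRIES];
}
```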

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoepoll: Set up the alarm_evq at init time
Barret Rhoden [Wed, 28 Sep 2016 16:21:01 +0000 (12:21 -0400)]
epoll: Set up the alarm_evq at init time

This way we don't need to alloc and free it repeatedly for timeouts.  The
main benefit for this now is that we actually leak memory when we free the
evqs in epoll.c (grep TODO.*INDIR).  This prevents long-running processes
from eventually running out of memory.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoAdd a helper for async syscalls
Barret Rhoden [Wed, 28 Sep 2016 16:18:52 +0000 (12:18 -0400)]
Add a helper for async syscalls

This helper makes an async syscall that will trigger the event queue upon
completion.  The caller doesn't check for completion manually; it waits on
the ev_q.  This helps a few async syscall use cases, and avoids the need to
register the evq (CAS and whatnot) after submitting the syscall.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoepoll: Clean up epoll_wait and stop excess polling
Barret Rhoden [Tue, 27 Sep 2016 18:23:05 +0000 (14:23 -0400)]
epoll: Clean up epoll_wait and stop excess polling

The old for loop would keep polling up to maxevents.  As soon as it fails
once, we should stop.  At that moment, the CEQ was empty and we should
either block or return.

Also, this fixes a subtle issue.  If we extracted a message but it didn't
have an epoll event, we were still advancing 'i', which means we'd have a
hole in our events (so, a gibberish event) and we'd skip the last event.
Alas, this was *a* bug, but not the bug I was looking for.

This also cleans up a bit of the logic for after we block, thanks to the
__epoll_wait_poll helper.
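
The two loop fixes (stop at the first empty poll, and only advance the
output index when an event was actually produced) look roughly like this
in a toy model. The types and function names are hypothetical; the real
code works on CEQ messages and epoll_event structs.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_msg {
	int fd;
	unsigned int events;	/* 0 models "no epoll event for this msg" */
};

struct toy_ceq {
	const struct toy_msg *msgs;
	size_t nr;
	size_t head;
};

/* Extract one message; returns false when the queue is empty. */
static bool toy_ceq_get(struct toy_ceq *q, struct toy_msg *out)
{
	if (q->head == q->nr)
		return false;
	*out = q->msgs[q->head++];
	return true;
}

/* Fills out[] with up to maxevents entries and returns the count. */
static int toy_wait_poll(struct toy_ceq *q, struct toy_msg *out,
                         int maxevents)
{
	struct toy_msg m;
	int i = 0;

	while (i < maxevents) {
		if (!toy_ceq_get(q, &m))
			break;		/* queue empty: stop polling */
		if (!m.events)
			continue;	/* no event: do not advance i */
		out[i++] = m;
	}
	return i;
}
```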

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoepoll: Fix event clobber
Barret Rhoden [Tue, 27 Sep 2016 15:48:51 +0000 (11:48 -0400)]
epoll: Fix event clobber

It was possible, though I never saw it, for an event entry to be clobbered.
Say we extracted a CEQ message for a particular FD.  That set the events
field in the epoll_event.  Then we attempt to extract another CEQ message,
possibly intending for another FD in the epoll set.  Instead, we get
another message for that same FD that we already set an event for.  When we
set that event, we clobber the original one.

The fix is to accumulate events for a given FD for all CEQ messages.
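
A small sketch of the accumulate-instead-of-assign idea, assuming a per-FD table. The table type, MAX_FDS, the EV_* bit values, and record_event are all hypothetical and are not taken from epoll.c.

```c
#include <stdint.h>

#define MAX_FDS 16	/* hypothetical table size */
#define EV_IN   0x1u	/* hypothetical event bits */
#define EV_OUT  0x4u

struct fd_events {
	uint32_t events[MAX_FDS];
};

/*
 * OR the new bits into the entry.  A plain assignment here would clobber
 * any event already recorded for the same FD by an earlier message.
 */
static void record_event(struct fd_events *t, int fd, uint32_t ev)
{
	t->events[fd] |= ev;
}
```

Two messages for the same FD, one readable and one writable, leave both bits set in that FD's entry.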

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoifconfig: use daemonize for cs, remove busy-waiting loop
Dan Cross [Thu, 6 Oct 2016 19:12:21 +0000 (15:12 -0400)]
ifconfig: use daemonize for cs, remove busy-waiting loop

Now that `cs` understands the daemonize protocol, use it in
the `ifconfig` script.  Remove the busy-waiting loop waiting
for the /srv/cs file to appear.

Change-Id: I06db794b38ad50957c56668f7a8cef807d54101c
Signed-off-by: Dan Cross <crossd@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agocs.c: Add an option to participate in the daemonize protocol.
Dan Cross [Thu, 6 Oct 2016 19:12:20 +0000 (15:12 -0400)]
cs.c: Add an option to participate in the daemonize protocol.

Add an option to make cs.c participate in the `daemonize` protocol:
it will signal completion by sending an event to its parent; this
instead of busy-waiting on the creation of a srv file.

Change-Id: Ibd44c7352ca3e71621255db0dac9069178b1f845
Signed-off-by: Dan Cross <crossd@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agocs.c: Use strlcpy to copy strings.
Dan Cross [Thu, 6 Oct 2016 19:12:19 +0000 (15:12 -0400)]
cs.c: Use strlcpy to copy strings.

Change-Id: I310486e539c9f587efc381d7f5a78d7fd5d9a3de
Signed-off-by: Dan Cross <crossd@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agocs.c: Fix all checkpatch warnings and errors.
Dan Cross [Thu, 6 Oct 2016 19:12:18 +0000 (15:12 -0400)]
cs.c: Fix all checkpatch warnings and errors.

The changes in this commit are all removal of checkpatch
warnings and errors.

Change-Id: I5ca1c72ce34bfb2087ad948846b8af9732fe9c6c
Signed-off-by: Dan Cross <crossd@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agocs.c: `main` returns `int`, use getopt().
Dan Cross [Thu, 6 Oct 2016 19:12:17 +0000 (15:12 -0400)]
cs.c: `main` returns `int`, use getopt().

Change-Id: If1da944eac61c762ce16ac9f1df755439250406c
Signed-off-by: Dan Cross <crossd@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agocs.c: Run clang-format on this file.
Dan Cross [Thu, 6 Oct 2016 19:12:16 +0000 (15:12 -0400)]
cs.c: Run clang-format on this file.

Change-Id: I82fc00318101fb321112361dadda497f48861e8c
Signed-off-by: Dan Cross <crossd@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoModify 'ipconfig' to send an event to 'daemonize'.
Dan Cross [Thu, 6 Oct 2016 19:12:15 +0000 (15:12 -0400)]
Modify 'ipconfig' to send an event to 'daemonize'.

As per the previous commit, modify `ipconfig` to participate
in the new `daemonize` protocol: use pthreads for the ipv6 router
advertisement/solicitation and ipv6 DHCP loop maintenance threads.
Send an event to our parent when we exit or go into the background.

Change `ifconfig` to use daemonize and the new ipconfig facility.

Change-Id: I60735d0427c5ad580522cd7e5ac210f8d2d5ae4b
Signed-off-by: Dan Cross <crossd@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoAdd 'daemonize': a program that spawns a process and waits for an event
Dan Cross [Thu, 6 Oct 2016 19:25:09 +0000 (15:25 -0400)]
Add 'daemonize': a program that spawns a process and waits for an event

Consider the typical 'fork-without-exec' pattern: often, a program
wants to do some initialization, acquire some resources, etc, and
then background itself by forking and exiting in the parent.  We
don't have great support for this, as e.g. reference counts on resources
created by the parent can keep processes alive when they should die.
(cf, `ipconfig` currently, where we acquire a DHCP lease in the
foreground before going into a loop in the background to maintain
the lease, but the parent process stays in the "DYING" state forever.)

Barret and I discussed this at length.  A possible solution is to have
a program that invokes the target process in the background and then
waits for an event from that process and then itself exits.  Note that
this requires a protocol between the two: any place where we exit in
the child needs to send an event, and the parent needs to get an event
or it blocks.
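
For illustration only, here is a POSIX analogue of that protocol, using fork() and a pipe in place of Akaros events. This is not how daemonize is implemented; spawn_and_wait_for_event and example_child are hypothetical names. One difference from the description above: with a pipe, a child that exits without sending anything makes the parent's read() return 0 rather than block forever.

```c
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: runs child_fn in a forked child, then blocks until the
 * child writes a one-byte event.  Returns the event, or -1 on error or if the
 * child exited without sending one.
 */
static int spawn_and_wait_for_event(void (*child_fn)(int notify_fd))
{
	int pfd[2];
	unsigned char ev;
	ssize_t n;
	pid_t pid;

	if (pipe(pfd) < 0)
		return -1;
	pid = fork();
	if (pid < 0) {
		close(pfd[0]);
		close(pfd[1]);
		return -1;
	}
	if (pid == 0) {
		/* Child: keep only the write end, run, then exit */
		close(pfd[0]);
		child_fn(pfd[1]);
		_exit(0);
	}
	/* Parent: keep only the read end and block for the event */
	close(pfd[1]);
	n = read(pfd[0], &ev, 1);
	close(pfd[0]);
	return n == 1 ? ev : -1;
}

/* Hypothetical child body: initialize, announce, then carry on */
static void example_child(int notify_fd)
{
	unsigned char ready = 42;

	/* ... foreground initialization would happen here ... */
	if (write(notify_fd, &ready, 1) != 1)
		_exit(1);
	close(notify_fd);
	/* ... background work would continue here ... */
}
```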

I think a generalization of 'wait' where we can send arbitrary messages
with payloads, more along the lines of our existing event support but
specialized to process control, would be nice but it's more work than
what we've got right now.  Think of something analogous to 'wait' but
that accepts a payload, paired with a 'background' call of some kind
that signals to a possibly-waiting parent that we're going off on our
own.

Change-Id: I531f1be3076786a638d57db1c660f9b768f5545e
Signed-off-by: Dan Cross <crossd@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoRemove `SYS_getpid` system call. (XCC)
Dan Cross [Wed, 5 Oct 2016 18:01:44 +0000 (14:01 -0400)]
Remove `SYS_getpid` system call. (XCC)

Remove the redundant and now-unused `getpid` system call:
the replacement is a function in the C library that simply
retrieves the process ID from __procinfo.

Reinstall your kernel headers.

Change-Id: Ib0649a17c2a7daf1f01194545d4122daf25f9e25
Signed-off-by: Dan Cross <crossd@gmail.com>
[XCC]
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoRemove `sys_getpid` stubs from parlib.
Dan Cross [Wed, 5 Oct 2016 17:44:49 +0000 (13:44 -0400)]
Remove `sys_getpid` stubs from parlib.

This system call is going away in favor of `getpid` from the
C library, which simply retrieves the PID from __procinfo.
Remove the stubs that call into the kernel from parlib.

Rebuild and reinstall parlib.

Change-Id: Id498e5f63c8d75302410444a0d8dd4f259cf5b34
Signed-off-by: Dan Cross <crossd@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoRemove references to sys_getpid from a variety of tests.
Dan Cross [Wed, 5 Oct 2016 17:43:08 +0000 (13:43 -0400)]
Remove references to sys_getpid from a variety of tests.

This system call is going away in favor of `getpid` from the C
library, which just retrieves the PID from the __procinfo region.
Remove references to it.

Change-Id: I25e5ea28647bc25a726985c01511e00d2d816285
Signed-off-by: Dan Cross <crossd@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoRemove call to 'getpid' system call from tls.h. (XCC)
Dan Cross [Wed, 5 Oct 2016 17:33:45 +0000 (13:33 -0400)]
Remove call to 'getpid' system call from tls.h. (XCC)

This call was only made to cross the user/kernel boundary
when mucking about with TLSes.  If we really still have to do
that, then use SYS_null instead.

Rebuild glibc.

Change-Id: I465d958f0a837da62096aa3a8d1054bdc0b99d94
Signed-off-by: Dan Cross <crossd@gmail.com>
[XCC]
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoAdd 'getppid()' to glibc to grub the ppid out of __procinfo (XCC)
Dan Cross [Wed, 5 Oct 2016 16:44:23 +0000 (12:44 -0400)]
Add 'getppid()' to glibc to grub the ppid out of __procinfo (XCC)

Add the `getppid()` function to glibc to retrieve a process's
parent process ID.  See POSIX:
http://pubs.opengroup.org/onlinepubs/009695399/functions/getppid.html

Rebuild and reinstall glibc.

Change-Id: I116129292da56bfd53cc2e6a23d9c9bb7c6246d1
Signed-off-by: Dan Cross <crossd@gmail.com>
[checkpatch/format]
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoExport epoch time via proc_global_info (XCC)
Barret Rhoden [Thu, 22 Sep 2016 18:15:50 +0000 (14:15 -0400)]
Export epoch time via proc_global_info (XCC)

The kernel was internally maintaining basically the same structure that
benchutil/alarm was maintaining: a mapping from epoch time to TSC ticks.
Now that info is exported to userspace.

This allows us to implement gettimeofday() and clock_gettime() in
userspace, which means we can remove SYS_gettimeofday.
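
A rough sketch of how userspace could turn such a mapping into a wall-clock time. The struct layout and field names are assumptions, not the actual proc_global_info fields. The arithmetic splits the tick count so the multiply does not overflow for TSC frequencies below roughly 18 GHz.

```c
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical exported mapping: at TSC value tsc_base, the wall-clock time
 * was epoch_ns_base nanoseconds since the epoch. */
struct epoch_map {
	uint64_t tsc_freq;	/* TSC ticks per second */
	uint64_t tsc_base;
	uint64_t epoch_ns_base;
};

/* Converts a tick count to nanoseconds without overflowing 64 bits */
static uint64_t tsc_to_ns(uint64_t ticks, uint64_t freq)
{
	uint64_t sec = ticks / freq;
	uint64_t rem = ticks % freq;

	return sec * NSEC_PER_SEC + rem * NSEC_PER_SEC / freq;
}

/* Wall-clock time corresponding to the TSC reading tsc_now */
static struct timespec epoch_time_at(const struct epoch_map *m,
                                     uint64_t tsc_now)
{
	uint64_t ns = m->epoch_ns_base +
	              tsc_to_ns(tsc_now - m->tsc_base, m->tsc_freq);
	struct timespec ts;

	ts.tv_sec = (time_t)(ns / NSEC_PER_SEC);
	ts.tv_nsec = (long)(ns % NSEC_PER_SEC);
	return ts;
}
```

A userspace clock_gettime() would read the current TSC and pass it as tsc_now; no syscall is needed once the mapping is readable.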

Rebuild glibc.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoUse a "one block at a time" policy for snoop queues
Barret Rhoden [Wed, 21 Sep 2016 19:49:21 +0000 (15:49 -0400)]
Use a "one block at a time" policy for snoop queues

This fixes snoopy's packet loss.

Snoopy expects to receive a packet at a time.  If you send more than one
packet in a read, the later packets get dropped.  We could change snoopy,
but it is a little easier to tell the kernel to just return one block
(packet) at a time when we do a qread.

In part, I went with this approach since we'll probably want more kernel
support for packet tracing, not less.  For instance, right now, snoopy's
time stamps are taken in userspace when the queue is read, not when the
packet is received or sent.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoqio: Add helpers to toggle state
Barret Rhoden [Wed, 21 Sep 2016 18:12:50 +0000 (14:12 -0400)]
qio: Add helpers to toggle state

I have a use case where I want to toggle Qmsg | Qcoalesce on at runtime,
specifically for using snoopy.  This is probably safe to do, though if you
have code that expects Qmsg, then suddenly it is turned off/on, it might
get confused by a short read.  So be careful.

With these two flags, I can turn a queue into "one non-zero block at a
time" mode.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoAllow snooping of the loopback medium
Barret Rhoden [Wed, 21 Sep 2016 19:58:41 +0000 (15:58 -0400)]
Allow snooping of the loopback medium

Snoopy either directly tries to clone a devether conversation or opens a
 #ip/ipifc/snoop file.  The only medium that supported that snoop was
pktmedium.  This commit adds support for loopback.

Incidentally, I'm not sure why devether doesn't use the ethermedium snoop.
Maybe it has something to do with setting promiscuous.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoUse a helper for tracing an interface
Barret Rhoden [Wed, 21 Sep 2016 19:40:06 +0000 (15:40 -0400)]
Use a helper for tracing an interface

I'll use this shortly for loopback.  This also keeps track of dropped
traces, which will help when wondering if snoopy isn't reporting
everything.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoFix etheriq()'s extra-data problems
Barret Rhoden [Wed, 21 Sep 2016 19:10:52 +0000 (15:10 -0400)]
Fix etheriq()'s extra-data problems

Or at least try to fix them.  The old code was clearly broken if you passed
in a block with extra data during tracing or any other snooping (including
virtio-net).

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoqio: Fix copyblock()
Barret Rhoden [Wed, 21 Sep 2016 18:48:36 +0000 (14:48 -0400)]
qio: Fix copyblock()

Previously, copyblock() couldn't handle extra data.

The only caller was passing the full block length, so let's just always do
the full amount.  At this point, it's basically the same as
linearizeblock(), minus the block swapping.  So we can implement
linearizeblock() with copyblock().
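
To make "always do the full amount" concrete, here is a sketch with a simplified block type. struct blk, NR_EXTRA, blk_len, and blk_copy_full are hypothetical and much simpler than the kernel's struct block; the point is only that the copy covers the main body plus every extra data segment and ends up contiguous.

```c
#include <stdlib.h>
#include <string.h>

#define NR_EXTRA 2	/* hypothetical number of extra data segments */

struct extra_seg {
	const unsigned char *base;
	size_t len;
};

/* Simplified block: a main body plus optional extra data segments */
struct blk {
	unsigned char *body;
	size_t body_len;
	struct extra_seg extra[NR_EXTRA];
};

/* Total payload length: body plus all extra data */
static size_t blk_len(const struct blk *b)
{
	size_t total = b->body_len;

	for (int i = 0; i < NR_EXTRA; i++)
		total += b->extra[i].len;
	return total;
}

/* Returns a malloc'd block whose body holds the full payload contiguously,
 * with no extra segments, or NULL on allocation failure. */
static struct blk *blk_copy_full(const struct blk *b)
{
	size_t total = blk_len(b);
	size_t off = 0;
	struct blk *nb = calloc(1, sizeof(*nb));

	if (!nb)
		return NULL;
	nb->body = malloc(total ? total : 1);
	if (!nb->body) {
		free(nb);
		return NULL;
	}
	if (b->body_len)
		memcpy(nb->body, b->body, b->body_len);
	off = b->body_len;
	for (int i = 0; i < NR_EXTRA; i++) {
		if (b->extra[i].len)
			memcpy(nb->body + off, b->extra[i].base, b->extra[i].len);
		off += b->extra[i].len;
	}
	nb->body_len = total;
	return nb;
}
```

The caller owns the result and frees both the body and the block.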

Note that Qsnoop uses copyblock(), making a copy of the block to pass up
to a snooping process (e.g. snoopy).  I considered making something like
qclone, where we clone the block instead - using pointers to the original
block's memory.  This would probably work with our current setup, but if we
ever use user memory for the block extra data segments, then we'd have a
problem.  The issue is one of immutability of data, and if we are using
user memory, then we have no guarantee of immutability (unless it's
read-only or something).  For now, we can just make a copy.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoqio: Fix minor bugs
Barret Rhoden [Wed, 21 Sep 2016 17:59:19 +0000 (13:59 -0400)]
qio: Fix minor bugs

Changing the state is racy, since it could muck with the receive path code
that also toggles flags.  If you made a qnonblock() call at the same time
as a qbread/qbwrite that changed the state flags, you could corrupt the
flags.
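
One way to avoid that kind of lost update is to make every flag change a single atomic read-modify-write. The sketch below uses C11 atomics; the flag names and fq_set_flag are hypothetical, and the actual fix in qio may well take the queue's lock instead.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define QF_NONBLOCK 0x1u	/* hypothetical flag bits */
#define QF_FLOW     0x2u

struct fq {
	atomic_uint state;
};

/*
 * Set or clear one flag atomically.  A plain "state |= f" is a separate
 * load and store, so two concurrent updaters can overwrite each other.
 */
static void fq_set_flag(struct fq *q, unsigned int f, bool on)
{
	if (on)
		atomic_fetch_or(&q->state, f);
	else
		atomic_fetch_and(&q->state, ~f);
}
```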

n -> len wouldn't even compile.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoAdd a hexdump printf specifier for userspace
Barret Rhoden [Mon, 19 Sep 2016 18:57:40 +0000 (14:57 -0400)]
Add a hexdump printf specifier for userspace

You need to register it, then your program can use it, like so:

printf("foo %p: %.*H", foo, (int)sizeof(*foo), foo);
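
For readers unfamiliar with custom specifiers, here is a sketch of the registration step using glibc's register_printf_specifier() from <printf.h>. Whether this commit registers its specifier this way, and what its output format looks like, are assumptions; the handler names and the plain two-hex-digits-per-byte format are made up for illustration.

```c
#include <printf.h>
#include <stdio.h>

/* Handler: prints info->prec bytes from the pointer argument as hex */
static int hexdump_print(FILE *stream, const struct printf_info *info,
                         const void *const *args)
{
	const void *vp = *(const void *const *)args[0];
	const unsigned char *p = vp;
	int len = info->prec < 0 ? 0 : info->prec;
	int total = 0;

	for (int i = 0; i < len; i++) {
		int r = fprintf(stream, "%02x", p[i]);

		if (r < 0)
			return -1;
		total += r;
	}
	return total;
}

/* Tells printf that the specifier consumes one pointer argument */
static int hexdump_arginfo(const struct printf_info *info, size_t n,
                           int *argtypes, int *size)
{
	(void)info;
	if (n > 0) {
		argtypes[0] = PA_POINTER;
		size[0] = sizeof(void *);
	}
	return 1;
}

/* Returns 0 on success, -1 on failure */
static int register_hexdump_specifier(void)
{
	return register_printf_specifier('H', hexdump_print, hexdump_arginfo);
}
```

The length travels in the precision, which is why the call site uses "%.*H" with an int length before the pointer. Compilers that check format strings may warn about the unknown 'H' conversion.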

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoAdd a useful errstr in dev.c
Barret Rhoden [Mon, 19 Sep 2016 16:51:40 +0000 (12:51 -0400)]
Add a useful errstr in dev.c

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoAvoid locking in sbrk during early SCP (XCC)
Barret Rhoden [Mon, 19 Sep 2016 16:39:31 +0000 (12:39 -0400)]
Avoid locking in sbrk during early SCP (XCC)

This is related to replacing glibc's LLLs with PDR locks.

If, for whatever reason I don't fully understand, a binary links against
the real PDR locks (and not the internal ones in glibc for ld.so), then it
might try to use them before TLS is initialized.  Then it'll die when
checking in_vcore_context(), since the TLS descriptor is 0.

For future reference, here was the backtrace:

uth_disable_notifs+0x4
spin_pdr_lock+0x11
sbrk+0x3a
__libc_setup_tls+0xa9
__libc_start_main+0x11e
_start+0x29

Perhaps setup_tls doesn't always sbrk?  Or the linkage with parlib caused
the spin_pdr_symbol to be overridden?

Either way, we can ignore locking during early SCP context, so this change
should be safe.

Rebuild glibc.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoAdd a printx lock
Barret Rhoden [Mon, 19 Sep 2016 15:38:31 +0000 (11:38 -0400)]
Add a printx lock

When debugging and relying on printed output, if you have a process that is
spamming the console, it can be hard to see what the kernel is printing
out.

This lock is meant to be used in debugging.  If you have printx turned on,
(px from the monitor), then the console prints will be synchronized with
other uses of px_lock().

For instance, if you want to drop a backtrace at some point, you could do:

sys_foo():
	if (some_condition && printx_on) {
		px_lock();
		backtrace_user_ctx(current, current_ctx);
		px_unlock();
	}

You'd never commit that blob of code, but it's useful when tracking down a
bug.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoFix non-UDP 'from' in recvfrom() (XCC)
Barret Rhoden [Mon, 19 Sep 2016 15:30:39 +0000 (11:30 -0400)]
Fix non-UDP 'from' in recvfrom() (XCC)

getsockname() gets *our* name, not the peer's name.  We wanted the peer's
name.

Rebuild glibc.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoFix minor endian issue (XCC)
Barret Rhoden [Mon, 19 Sep 2016 15:29:33 +0000 (11:29 -0400)]
Fix minor endian issue (XCC)

That first assignment to sin_port in recvfrom was overwritten and
confusing.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoUse Linux's network headers in glibc (XCC)
Barret Rhoden [Fri, 16 Sep 2016 20:38:53 +0000 (16:38 -0400)]
Use Linux's network headers in glibc (XCC)

A bunch of programs use these headers for things like socket options.
We hardly support any of them, but over time we can add those that make
sense to our socket shims.

Rebuild glibc.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoSet the socket family in recvfrom() (XCC)
Barret Rhoden [Wed, 14 Sep 2016 18:32:03 +0000 (14:32 -0400)]
Set the socket family in recvfrom() (XCC)

We were setting the port and address, but not the family.

Rebuild glibc.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoRemove netinet.h from the kernel
Barret Rhoden [Fri, 16 Sep 2016 21:00:48 +0000 (17:00 -0400)]
Remove netinet.h from the kernel

That was an ancient, unused header file that cluttered up the space.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoClean up x86_64 sysdeps (XCC)
Barret Rhoden [Mon, 12 Sep 2016 20:15:08 +0000 (16:15 -0400)]
Clean up x86_64 sysdeps (XCC)

Our sysdep.h was just including Linux's, which was a mild surprise when
trying to debug the PTR_MANGLE / atexit() bug.  We don't need a lot of the
stuff that is Linux dependent.

Our sysdep.h now just contains a few things copy-and-pasted from the Linux
header, and it doesn't include things like __NR_pread, LOAD_REGS, and other
syscall-ABI specific things.

Rebuild glibc.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoSet the glibc thread's pointer_guard (XCC)
Barret Rhoden [Mon, 12 Sep 2016 20:09:33 +0000 (16:09 -0400)]
Set the glibc thread's pointer_guard (XCC)

Every thread needs to have the same pointer_guard.  We inherit it from the
creating parent (TLS, really).  And while we're here, set the stack_guard
too.  This is what glibc does when it creates a pthread.

All threads must have the same pointer_guard.  The issue is that if a
thread other than thread0 sets an atexit() function pointer, then the
pointer gets mangled with the wrong pointer_guard value.  Then when thread0
exits, it will improperly demangle the value.
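
The failure mode can be shown with a sketch of the mangling itself. On x86_64, glibc's PTR_MANGLE XORs the pointer with the guard and rotates left by 0x11 bits; the C helpers below mirror that but are illustrative names, not glibc's macros.

```c
#include <stdint.h>

static uint64_t rol64(uint64_t v, unsigned int r)
{
	return (v << r) | (v >> (64 - r));
}

static uint64_t ror64(uint64_t v, unsigned int r)
{
	return (v >> r) | (v << (64 - r));
}

/* XOR with the guard, then rotate left by 0x11 bits */
static uint64_t ptr_mangle(uint64_t ptr, uint64_t guard)
{
	return rol64(ptr ^ guard, 0x11);
}

/* Inverse: rotate right by 0x11 bits, then XOR with the guard */
static uint64_t ptr_demangle(uint64_t mangled, uint64_t guard)
{
	return ror64(mangled, 0x11) ^ guard;
}
```

Demangling with the same guard returns the original pointer. Demangling with a different guard returns the pointer XORed with the difference of the two guards, which is the garbage address thread0 would jump to at exit.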

I considered just turning off pointer mangling, but this is fine as is.

Note that we set the pointer_guard on every TLS, not on every thread.  This
is fine.  Basically this is a global value that every context should see,
so it's fine for it to be in every TLS.  I think glibc doesn't really make
a distinction between TLS and threads; every TLS region has the glibc
thread struct sitting at fs:0, for instance.

I'm not sure why glibc didn't use a global.  Maybe it'd be easier for an
attacker to find the global than the TLS value (to do their own
demangling).  Though fs:0x30 is probably just as easy to find, and you
don't have to do any linking to find the global.

Rebuild glibc.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoUse PDR locks for glibc's internal locks (XCC)
Barret Rhoden [Fri, 9 Sep 2016 18:49:06 +0000 (14:49 -0400)]
Use PDR locks for glibc's internal locks (XCC)

Our glibc port was using a simple spinlock for its LLL (low-level lock).
The problem with this is that the lock could be grabbed by vcore context
code, and in general is not safe from preemption.

The fix is to use PDR locks.  There are a couple nasty details.

One is that glibc mostly assumes a lock is an int.  We could hack up the
sysdeps completely (note that the INITIALIZER is compared to 0 directly),
but that's a mess.  Instead, we rely on the fact that spin PDR locks are 32
bits.

The other detail is that ld.so uses the locks, and it doesn't link with
parlib.  Even if it did, or if we moved the PDR locks to glibc directly,
it'd possibly be a mess, since ld grabs the locks before any of our parlib
constructors (I think).  The way it works now is that ld.so uses the
internal versions of the locks, and anything that links against parlib
(i.e. a binary that ld loads) should get the parlib version.  This might
not be working exactly as I think: see my notes in parlib-compat.c for
details.

A minor point to note is the removal of parlib/common.h from x86/atomic.h.
The common.h header pulls in way too much, and now that glibc's LLL needs
parlib/spinlock.h, we run into build issues.  So don't add things to
low-level parlib header files unnecessarily.

Rebuild the world.  AFAIK, even dynamically linked apps need to be rebuilt,
otherwise they may use the version of the locks that ld.so uses.  For
single-threaded apps, this is probably okay.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoAdd spin_pdr_trylock
Barret Rhoden [Thu, 8 Sep 2016 17:52:44 +0000 (13:52 -0400)]
Add spin_pdr_trylock

Note I didn't use trylock for the spin_pdr_lock implementation.  If we know
we want to lock, no matter what, then we don't want to bother disabling and
enabling notifs repeatedly.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoFix spinlock_trylock's return value
Barret Rhoden [Thu, 8 Sep 2016 17:49:27 +0000 (13:49 -0400)]
Fix spinlock_trylock's return value

We should be returning TRUE when we successfully lock the lock, not EBUSY.
This is saner, and more in line with other locking APIs.
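
A sketch of that convention using C11 atomics. The sketch_* names are hypothetical and this is not the parlib spinlock; it only shows a trylock that reports success as true, so callers can write "if (trylock(&l))".

```c
#include <stdatomic.h>
#include <stdbool.h>

struct sketch_spinlock {
	atomic_flag locked;
};

/* Returns true if we took the lock, false if it was already held */
static bool sketch_trylock(struct sketch_spinlock *l)
{
	return !atomic_flag_test_and_set_explicit(&l->locked,
	                                          memory_order_acquire);
}

static void sketch_unlock(struct sketch_spinlock *l)
{
	atomic_flag_clear_explicit(&l->locked, memory_order_release);
}
```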

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoRemove the NO_CAS version of spin_pdr locks
Barret Rhoden [Thu, 8 Sep 2016 16:38:16 +0000 (12:38 -0400)]
Remove the NO_CAS version of spin_pdr locks

If we ever update the RISC-V port, we can either build CAS out of their
LL/SC or do something else.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoAllow uth_disable_notifs without current_uthread
Barret Rhoden [Thu, 8 Sep 2016 19:31:29 +0000 (15:31 -0400)]
Allow uth_disable_notifs without current_uthread

Upcoming changes to glibc's low-level locks will allow disable notifs to be
called before current_uthread is set up.

An unfortunate side-effect of allowing this is that we lose the ability to
catch certain bugs (i.e. we no longer have an assert).  Also, if current_uthread
gets set in the middle of a disable/enable pair, then the
notif_disabled_depth and the associated logic will go crazy.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoFix sbrk's lock initialization (XCC)
Barret Rhoden [Thu, 8 Sep 2016 19:18:35 +0000 (15:18 -0400)]
Fix sbrk's lock initialization (XCC)

We were getting away with the uninitialized lock since a value of 0 was OK
for the current LLL locks.

I spotted this when trying to use PDR locks, and LD was flipping out on a
bare-bones replacement for spin_pdr_locks.  A while loop and a CTRL-B
backtrace identified sbrk() as the culprit.

Rebuild glibc, if you want.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoFix include paths in lock_test
Barret Rhoden [Thu, 8 Sep 2016 17:46:28 +0000 (13:46 -0400)]
Fix include paths in lock_test

This is for building lock_test on Linux.  I missed these paths when we
changed the way user library headers are organized.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoAlways provide a user context to signal handlers
Barret Rhoden [Thu, 8 Sep 2016 15:51:01 +0000 (11:51 -0400)]
Always provide a user context to signal handlers

Signal handlers expect some context.  They actually expect a struct
ucontext, which is defined in Glibc.  Further, some programs make
assumptions about the contents of ucontext (and mcontext).  We'll provide
them with a user_context for now.  It's one thing to want a ucontext.  It's
a bit nastier for handlers to demand that it matches the format of whatever
glibc they are using (where the individual fields of an x86_64 mcontext are
API).

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoAdd a helper for finding current_uthread's context
Barret Rhoden [Thu, 8 Sep 2016 15:47:03 +0000 (11:47 -0400)]
Add a helper for finding current_uthread's context

It's a very easy bug to look into current_uthread to see its context, but
you might be looking at an old context.  Use this helper to find the
correct user context for current_uthread.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoMove the get_user_ctx_* helpers to ros/ (XCC)
Barret Rhoden [Thu, 8 Sep 2016 15:45:08 +0000 (11:45 -0400)]
Move the get_user_ctx_* helpers to ros/ (XCC)

Userspace can use these helpers too, they are fundamentally based on the
structs that are already in the kernel headers, and we might as well have
one copy of them.

Reinstall your kernel headers.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoMake signal handler functions vcore-ctx-safe
Barret Rhoden [Wed, 7 Sep 2016 21:02:31 +0000 (17:02 -0400)]
Make signal handler functions vcore-ctx-safe

Akaros makes a distinction between inter-process signals (think kill from
the shell) and intra-process signals (think pthread_kill()).  Intra-process
signals go to uthreads.  Inter-process signals are sent to the entire
process, are global, and are handled by a vcore event handler.

It is possible for signal handlers to call various sig-functions.  This
commit makes those calls safe when they are made from vcore context.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoUse a helper for determining if a uth handles sigs
Barret Rhoden [Wed, 7 Sep 2016 20:59:23 +0000 (16:59 -0400)]
Use a helper for determining if a uth handles sigs

I had to look around the .c file to figure out that the existence of data
meant the uthread was handling signals.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoIn dev_stdout_write, user_strdup_errno should be user_memdup_errno.
Dan Cross [Tue, 13 Sep 2016 19:26:52 +0000 (15:26 -0400)]
In dev_stdout_write, user_strdup_errno should be user_memdup_errno.

We don't use the NUL terminator (presumably) copied by the strdup
variant, and it may not be valid anyway.
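
The difference can be sketched in plain userspace C. sketch_memdup is a hypothetical stand-in for the kernel's user_memdup_errno: it copies exactly 'len' bytes and never looks for, or depends on, a NUL terminator in the source.

```c
#include <stdlib.h>
#include <string.h>

/* Copies exactly 'len' bytes of 'src' into a new buffer.  Returns NULL on
 * allocation failure.  The result is not NUL-terminated. */
static void *sketch_memdup(const void *src, size_t len)
{
	void *dst = malloc(len ? len : 1);

	if (dst && len)
		memcpy(dst, src, len);
	return dst;
}
```

A strdup-style copy of a write() buffer would scan for a NUL that the caller never promised to provide; a length-based copy stays within the bytes the caller handed over.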

Change-Id: Ibce75aaaa8684f2ea5282c53ed8a546c7ff9b724
Signed-off-by: Dan Cross <crossd@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agovthread: allow programs to have threads exit on halt.
Ronald G. Minnich [Thu, 8 Sep 2016 16:51:26 +0000 (09:51 -0700)]
vthread: allow programs to have threads exit on halt.

For example, a benchmark declares the vm as follows:
struct virtual_machine vm = {.halt_exit = true,};

Which will force guests that halt to exit.

Change-Id: Ie6368093072f324c86c9ace1807075cd073d540c
Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>