akaros.git
4 years ago vmmcp: allow a fake write to CSTAR
Ronald G. Minnich [Wed, 8 Jul 2015 15:15:41 +0000 (08:15 -0700)]
vmmcp: allow a fake write to CSTAR

We don't expect that a guest will ever really use this.

Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago vmmcp: begin cleanup of vmrunkernel; grow memory size.
Ronald G. Minnich [Tue, 7 Jul 2015 23:39:03 +0000 (16:39 -0700)]
vmmcp: begin cleanup of vmrunkernel; grow memory size.

Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Allow guests to do INVLPG.
Ronald G. Minnich [Tue, 7 Jul 2015 23:38:29 +0000 (16:38 -0700)]
Allow guests to do INVLPG.

We require EPTs so that should be ok.

Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago VMX: only check the PB VM EC2 if EC1 is ok
Barret Rhoden [Wed, 1 Jul 2015 17:09:01 +0000 (10:09 -0700)]
VMX: only check the PB VM EC2 if EC1 is ok

We should only check the Secondary Processor-Based VM-Execution controls
if the bit is present in the primary controls.  It's one of our
set-to-one bits.

If we do the secondary checks without that magic bit set, we'll GPF.  We
don't have rdmsr_safe() or anything like that either.

Ultimately, once 'ok' is false, we're going to fail anyway.  It's just a
question of how much info we get.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
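
A minimal sketch of the ordering described in the VMX commit above, not the actual Akaros code. It assumes the standard Intel MSR numbers (0x482 for the primary processor-based controls capability, 0x48b for the secondary) and that bit 31 of the primary controls activates the secondary controls. fake_rdmsr() and the two fake_* variables are hypothetical stand-ins so the sketch runs in user space; on real hardware the second read is the one that would GPF.

```c
#include <stdbool.h>
#include <stdint.h>

#define MSR_IA32_VMX_PROCBASED_CTLS	0x482
#define MSR_IA32_VMX_PROCBASED_CTLS2	0x48b
#define ACTIVATE_SECONDARY_CONTROLS	(1u << 31)

/* Hypothetical stand-ins for the privileged rdmsr instruction. */
bool fake_cpu_has_secondary;
int fake_secondary_reads;

static uint64_t fake_rdmsr(uint32_t reg)
{
	if (reg == MSR_IA32_VMX_PROCBASED_CTLS)
		return (uint64_t)(fake_cpu_has_secondary ?
		                  ACTIVATE_SECONDARY_CONTROLS : 0) << 32;
	/* A real CPU without secondary controls would fault here. */
	fake_secondary_reads++;
	return 0;
}

/* The high 32 bits of a VMX capability MSR are the allowed-1 settings. */
static bool ctls_allow(uint64_t cap, uint32_t want)
{
	return (want & ~(uint32_t)(cap >> 32)) == 0;
}

/* want1 is assumed to always contain ACTIVATE_SECONDARY_CONTROLS, since
 * the commit calls it one of the set-to-one bits. */
bool check_procbased_ctls(uint32_t want1, uint32_t want2)
{
	bool ok = ctls_allow(fake_rdmsr(MSR_IA32_VMX_PROCBASED_CTLS), want1);

	/* Only touch the secondary MSR once the primary check passed. */
	if (ok)
		ok = ctls_allow(fake_rdmsr(MSR_IA32_VMX_PROCBASED_CTLS2), want2);
	return ok;
}
```

Either way the caller fails when 'ok' is false; the guard only decides whether it fails with a diagnostic or with a fault.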
4 years ago vmmcp: add "fake write" msr
Ronald G. Minnich [Wed, 1 Jul 2015 16:09:59 +0000 (09:09 -0700)]
vmmcp: add "fake write" msr

Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago vmmcp: emulated msr infrastructure
Ronald G. Minnich [Wed, 1 Jul 2015 02:22:57 +0000 (19:22 -0700)]
vmmcp: emulated msr infrastructure

For an emulated msr, we will look for it, then see how much emulation
we want to do. There is a per-msr function which handles read and write.

Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
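
One way the per-MSR handler idea from the commit above could look, as a sketch under assumptions rather than the vmmcp code itself. The struct layout, handler names, and the shadow field are hypothetical. The MSR numbers are the architectural ones (0xc0000083 for CSTAR, 0x17 for IA32_PLATFORM_ID). The "fake write" handler remembers what the guest wrote without touching hardware.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct emmsr;
typedef bool (*emmsr_fn)(struct emmsr *msr, bool is_write, uint64_t *val);

struct emmsr {
	uint32_t reg;		/* MSR index the guest accessed */
	const char *name;
	emmsr_fn handler;	/* one function handles read and write */
	uint64_t shadow;	/* value the guest sees on reads */
};

/* Fake write: accept the write, store it, never pass it to hardware. */
static bool emsr_fakewrite(struct emmsr *msr, bool is_write, uint64_t *val)
{
	if (is_write)
		msr->shadow = *val;
	else
		*val = msr->shadow;
	return true;
}

/* Read-only: reads return the shadow, writes are refused. */
static bool emsr_readonly(struct emmsr *msr, bool is_write, uint64_t *val)
{
	if (is_write)
		return false;
	*val = msr->shadow;
	return true;
}

static struct emmsr emmsrs[] = {
	{0xc0000083, "MSR_CSTAR", emsr_fakewrite, 0},
	{0x00000017, "MSR_IA32_PLATFORM_ID", emsr_readonly, 0},
};

/* Look the MSR up, then let its handler decide how much to emulate.
 * Returns false if the MSR is unknown or the access is refused. */
bool emulate_msr(uint32_t reg, bool is_write, uint64_t *val)
{
	for (size_t i = 0; i < sizeof(emmsrs) / sizeof(emmsrs[0]); i++) {
		if (emmsrs[i].reg == reg)
			return emmsrs[i].handler(&emmsrs[i], is_write, val);
	}
	return false;
}
```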
4 years ago vmmcp: set up msr handling framework.
Ronald G. Minnich [Tue, 30 Jun 2015 21:22:29 +0000 (14:22 -0700)]
vmmcp: set up msr handling framework.

Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago vmmcp: turn off mcp in vmrunkernel for now so we can run on two cores.
Ronald G. Minnich [Tue, 30 Jun 2015 16:37:57 +0000 (09:37 -0700)]
vmmcp: turn off mcp in vmrunkernel for now so we can run on two cores.

Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago vmmcp: open up cr4; fix cpuid handling
Ronald G. Minnich [Tue, 30 Jun 2015 23:26:15 +0000 (16:26 -0700)]
vmmcp: open up cr4; fix cpuid handling

This gets linux much further.

Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago vmmcp: allow EFER writes. Clear cr0 shadow to 0s so guest can write all bits.
Ronald G. Minnich [Mon, 29 Jun 2015 23:26:15 +0000 (16:26 -0700)]
vmmcp: allow EFER writes. Clear cr0 shadow to 0s so guest can write all bits.

Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Print information about msr settings that don't quite work out.
Ronald G. Minnich [Mon, 29 Jun 2015 20:45:27 +0000 (13:45 -0700)]
Print information about msr settings that don't quite work out.

Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago vmmcp: add higherkernbase, more debugging.
Ronald G. Minnich [Mon, 29 Jun 2015 17:27:56 +0000 (10:27 -0700)]
vmmcp: add higherkernbase, more debugging.

Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Rename pgoffset -> pg_num in load_one_elf()
Barret Rhoden [Mon, 2 Nov 2015 16:58:11 +0000 (11:58 -0500)]
Rename pgoffset -> pg_num in load_one_elf()

We use offset to mean the offset into the page.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Fix parlib/assert.h's warn()
Barret Rhoden [Mon, 2 Nov 2015 15:36:25 +0000 (10:36 -0500)]
Fix parlib/assert.h's warn()

Glibc's err.h has its own warn, which we need to override.  If someone
would include err.h, it would collide with parlib's warn.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Avoid double declarations, integer overflow, and use branch hints
Davide Libenzi [Fri, 16 Oct 2015 17:04:11 +0000 (10:04 -0700)]
Avoid double declarations, integer overflow, and use branch hints

Avoid double declarations, integer overflow, and use branch hints.

Signed-off-by: Davide Libenzi <dlibenzi@google.com>
[Kept the double declaration, added const for rwaddr]
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Added new kernel test case
Davide Libenzi [Fri, 16 Oct 2015 00:13:49 +0000 (17:13 -0700)]
Added new kernel test case

Added new kernel test case for the exception table fixup code.

Signed-off-by: Davide Libenzi <dlibenzi@google.com>
[Touched up checkpatch complaint]
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Plugged the exception handling code
Davide Libenzi [Thu, 15 Oct 2015 22:29:20 +0000 (15:29 -0700)]
Plugged the exception handling code

Plugged the exception handling code into the Akaros trap handling
path.

Signed-off-by: Davide Libenzi <dlibenzi@google.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Added safe user memory access APIs
Davide Libenzi [Thu, 15 Oct 2015 22:26:00 +0000 (15:26 -0700)]
Added safe user memory access APIs

Added safe user memory access APIs, which allow kernel code to
copy data to and from user memory, with zero cost on the fast path.
The exception table facility can also be used in other cases, where
we are executing a potentially faulting instruction.
The code comes from the Linux kernel version 3.11.10, most of
it from the arch/x86/include/asm/uaccess.h include file.

Signed-off-by: Davide Libenzi <dlibenzi@google.com>
[Touched up checkpatch complaint and compiler.h]
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
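
A rough sketch of how an exception table lookup of the kind mentioned above can work, assuming a table of (faulting instruction, fixup) address pairs that is sorted once and then binary-searched from the trap handler. The struct and function names are hypothetical, the addresses are absolute for simplicity, and qsort() stands in for whatever sort the kernel actually uses.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct extable_entry {
	uintptr_t insn;		/* address of the instruction that may fault */
	uintptr_t fixup;	/* where to resume if it does */
};

static int extable_cmp(const void *a, const void *b)
{
	const struct extable_entry *x = a, *y = b;

	return (x->insn > y->insn) - (x->insn < y->insn);
}

/* Sort once at boot so lookups can use binary search. */
void extable_sort(struct extable_entry *tbl, size_t n)
{
	qsort(tbl, n, sizeof(*tbl), extable_cmp);
}

/* Returns the fixup address for fault_ip, or 0 if the fault did not come
 * from a registered instruction (i.e. it is a genuine kernel bug). */
uintptr_t extable_find_fixup(const struct extable_entry *tbl, size_t n,
                             uintptr_t fault_ip)
{
	struct extable_entry key = { fault_ip, 0 };
	const struct extable_entry *e;

	e = bsearch(&key, tbl, n, sizeof(*tbl), extable_cmp);
	return e ? e->fixup : 0;
}
```

The fast path pays nothing because the table is consulted only after a fault has already happened.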
4 years ago Added #ifdef wrapping to prevent double function definitions
Davide Libenzi [Fri, 16 Oct 2015 16:37:13 +0000 (09:37 -0700)]
Added #ifdef wrapping to prevent double function definitions

Added #ifdef wrapping to prevent double function definitions, which
were triggered by umem.h usage in uaccess.h

Signed-off-by: Davide Libenzi <dlibenzi@google.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Added heapsort utility function to the lib/ framework.
Davide Libenzi [Thu, 15 Oct 2015 22:23:09 +0000 (15:23 -0700)]
Added heapsort utility function to the lib/ framework.

Signed-off-by: Davide Libenzi <dlibenzi@google.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Fix waserror/lock order
Davide Libenzi [Sat, 17 Oct 2015 22:01:42 +0000 (15:01 -0700)]
Fix waserror/lock order

Fix waserror/lock order in order to enforce the acquire before waserror
pattern.

Signed-off-by: Davide Libenzi <dlibenzi@google.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Replace most uses of strncpy with strlcpy.
Dan Cross [Fri, 16 Oct 2015 16:24:55 +0000 (12:24 -0400)]
Replace most uses of strncpy with strlcpy.

Strncpy has strange and subtle semantics; it was being used
incorrectly in many places. Replace almost everywhere with
strlcpy or memmove.

Note that spatch will in some cases introduce simply incorrect
code when it replaces calls to strcpy; it will take sizeof()
the destination argument, but if that's a pointer, then one
ends up with the size of the pointer type (for our platforms,
8 bytes) instead of the proper size of the destination. When
I saw things like that, I fixed them.

Signed-off-by: Dan Cross <dcross@google.com>
[Minor checkpatch touchups]
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
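
The sizeof() pitfall described in the commit above can be shown in a few lines. This is an illustrative sketch with hypothetical function names: once an array has been passed to a function it is a pointer, and sizeof() reports the pointer size rather than the buffer size.

```c
#include <stddef.h>

#define BUF_LEN 32

/* 'dst' is a pointer here, so sizeof yields the pointer size
 * (8 bytes on the 64-bit platforms the commit mentions). */
size_t size_seen_through_pointer(char *dst)
{
	return sizeof(dst);
}

/* In the scope that declares the array, sizeof yields the array size. */
size_t size_seen_in_array_scope(void)
{
	char buf[BUF_LEN];

	return sizeof(buf);
}
```

A mechanical rewrite to strlcpy(dst, src, sizeof(dst)) is therefore only right in the second situation; in the first, the real buffer length has to be passed in explicitly.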
4 years ago Don't change calls to strcpy() to strncpy(); use strlcpy() instead.
Dan Cross [Thu, 15 Oct 2015 21:13:58 +0000 (17:13 -0400)]
Don't change calls to strcpy() to strncpy(); use strlcpy() instead.

The semantics of strncpy() are confusing. We had been
converting Plan 9 code to Akaros by replacing calls to
strcpy() (which we wisely removed from our kernel some
time ago), but strncpy():

1. Does not necessarily NUL-terminate its destination
   buffer (e.g., if strlen(src) >= size),
2. Will NUL-pad the entire destination string if the
   source string is shorter than the buffer length; this
   is wasteful if we just want proper NUL-termination.

Instead, call strlcpy(), which handles both of the above
cases correctly.

In general, one should not call strncpy() unless one is
sure that they want these semantics.

Signed-off-by: Dan Cross <crossd@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
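
The two strncpy() behaviors listed in the commit above can be observed directly. A small sketch, with hypothetical helper names, that pre-fills the destination with 'x' so the effect of the copy is visible:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Case 1: is there any NUL in dst[0..size) after the copy? */
bool strncpy_terminated(char *dst, size_t size, const char *src)
{
	memset(dst, 'x', size);
	strncpy(dst, src, size);
	return memchr(dst, '\0', size) != NULL;
}

/* Case 2: how many NUL bytes did the copy write into dst[0..size)? */
size_t strncpy_nul_count(char *dst, size_t size, const char *src)
{
	size_t count = 0;

	memset(dst, 'x', size);
	strncpy(dst, src, size);
	for (size_t i = 0; i < size; i++) {
		if (dst[i] == '\0')
			count++;
	}
	return count;
}
```

With a 4-byte buffer and the source "abcd", the first helper reports no terminator; with a 16-byte buffer and the source "ab", the second reports 14 NUL bytes, one terminator plus 13 bytes of padding.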
4 years ago Fix strlcpy in kernel, add strlcat.
Dan Cross [Thu, 15 Oct 2015 20:51:55 +0000 (16:51 -0400)]
Fix strlcpy in kernel, add strlcat.

The return value of strlcpy was incorrect: it was returning
the amount it had copied, but the return value is supposed to
be the size of the input string.

I also added strlcat (with a prodigious comment, as the code
is subtle) for parity.

Nothing was checking the return value of strlcpy as far as I
could see, and while it's not specified by e.g. ANSI/ISO, it
still makes sense for us to follow the specification of other
implementations.

Tested: Rebuilt the kernel and ran Akaros.

References:
http://www.openbsd.org/cgi-bin/man.cgi/OpenBSD-current/man3/strlcat.3?query=strlcpy&sec=3
http://www.sudo.ws/todd/papers/strlcpy.html

Signed-off-by: Dan Cross <crossd@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
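
For reference, a sketch of the return-value convention the commit above describes, following the OpenBSD semantics: strlcpy returns strlen(src), and strlcat returns the length of the string it tried to create. This is not the Akaros implementation; the sketch_ names are hypothetical and chosen to avoid clashing with a libc that already provides these functions.

```c
#include <stddef.h>
#include <string.h>

/* Copies at most size - 1 bytes and always terminates when size > 0.
 * Returns strlen(src), so truncation is simply "ret >= size". */
size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
	size_t srclen = strlen(src);

	if (size > 0) {
		size_t n = srclen < size - 1 ? srclen : size - 1;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	return srclen;
}

/* Appends src to dst, where size is the full size of dst.
 * Returns the length it tried to create: initial dst length + strlen(src). */
size_t sketch_strlcat(char *dst, const char *src, size_t size)
{
	size_t srclen = strlen(src);
	size_t dstlen = 0;
	size_t n;

	/* Never scan past size, in case dst is not terminated. */
	while (dstlen < size && dst[dstlen] != '\0')
		dstlen++;
	if (dstlen == size)
		return size + srclen;
	n = srclen < size - dstlen - 1 ? srclen : size - dstlen - 1;
	memcpy(dst + dstlen, src, n);
	dst[dstlen + n] = '\0';
	return dstlen + srclen;
}
```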
4 years ago Clean up aliased monitor commands
Barret Rhoden [Mon, 26 Oct 2015 21:51:45 +0000 (17:51 -0400)]
Clean up aliased monitor commands

The better way to alias a monitor command is to just point the aliased
entry at the real function, not to implement the function a second time.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
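
The aliasing approach from the commit above amounts to two table entries sharing one function pointer. A sketch with a hypothetical command table; the struct layout, function names, and return values are illustrative and not taken from the monitor's source:

```c
#include <stddef.h>
#include <string.h>

typedef int (*mon_fn)(int argc, char **argv);

struct command {
	const char *name;
	const char *desc;
	mon_fn func;
};

static int mon_help(int argc, char **argv)
{
	(void)argc;
	(void)argv;
	return 0;
}

static int mon_exit(int argc, char **argv)
{
	(void)argc;
	(void)argv;
	return -1;	/* hypothetical "leave the monitor" value */
}

static const struct command commands[] = {
	{"help", "Display the list of commands", mon_help},
	{"exit", "Leave the monitor", mon_exit},
	{"e", "Alias for exit", mon_exit},	/* same function, no copy */
};

/* Returns the handler for name, or NULL if there is no such command. */
mon_fn lookup_command(const char *name)
{
	for (size_t i = 0; i < sizeof(commands) / sizeof(commands[0]); i++) {
		if (strcmp(commands[i].name, name) == 0)
			return commands[i].func;
	}
	return NULL;
}
```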
4 years ago Alias "e" to monitor's "exit"
Barret Rhoden [Mon, 26 Oct 2015 21:49:21 +0000 (17:49 -0400)]
Alias "e" to monitor's "exit"

If you have multiple readers of the console, such as busybox and the
monitor, then it's hard to enter "exit."  This way, it's just one key.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Accept more types of FD Taps in #ip
Barret Rhoden [Mon, 26 Oct 2015 19:16:05 +0000 (15:16 -0400)]
Accept more types of FD Taps in #ip

As discussed in commit cfb02bd7818d ("Accept more types of FD Taps in
 #eventfd"), we need to allow some "unimportant" taps, such as PRIORITY
and ERROR.  #ip currently will not trigger any events for these, since
they do not occur in the networking stack (yet), but the user can at
least ask for them.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Implement recvmsg() (XCC)
Barret Rhoden [Mon, 26 Oct 2015 19:12:07 +0000 (15:12 -0400)]
Implement recvmsg() (XCC)

It's about as standards compliant as recvfrom().

Rebuild glibc.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Fix UDP socket bug (XCC)
Barret Rhoden [Fri, 23 Oct 2015 18:26:07 +0000 (14:26 -0400)]
Fix UDP socket bug (XCC)

All UDP sockets were pointing at conversation 0.  We would successfully
clone a new UDP conversation, but when we attempted to read the ctl
file, we'd get an empty string back instead of the conversation ID.
That worked if our conversation happened to be 0, but otherwise would
fail.

The root cause was that we advanced the offset by 7 when we wrote the
"headers" command for UDP sockets.  We need to reset the chan back to 0
before reading.

Rebuild glibc.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
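
The offset problem in the UDP commit above can be modeled in user space. This is a simplified simulation, not the #ip device: struct sketch_chan and both functions are hypothetical, and the only behavior modeled is that a write advances the channel offset while a read of the ctl file returns the conversation ID starting at the current offset.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

struct sketch_chan {
	size_t offset;	/* shared by reads and writes */
	int conv_id;
};

/* Writing a command advances the offset by the command length. */
size_t sketch_ctl_write(struct sketch_chan *c, const char *cmd)
{
	size_t len = strlen(cmd);

	c->offset += len;
	return len;
}

/* Reading returns the conversation ID text from the current offset on. */
size_t sketch_ctl_read(struct sketch_chan *c, char *buf, size_t size)
{
	char id[16];
	size_t len = (size_t)snprintf(id, sizeof(id), "%d", c->conv_id);
	size_t n = 0;

	if (size == 0)
		return 0;
	if (c->offset < len) {
		n = len - c->offset;
		if (n > size - 1)
			n = size - 1;
		memcpy(buf, id + c->offset, n);
	}
	buf[n] = '\0';
	c->offset += n;
	return n;
}
```

After writing "headers" the offset is 7, which is past the end of any short ID string, so the read comes back empty. If the caller parses that with an atoi-style function (an assumption here), the empty string becomes 0, which would explain why only conversation 0 appeared to work. Resetting the offset to 0 before the read returns the real ID.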
4 years ago Weak alias getsockopt() (XCC)
Barret Rhoden [Wed, 21 Oct 2015 15:29:29 +0000 (11:29 -0400)]
Weak alias getsockopt() (XCC)

This allows us to override it in a user library or application, which
tends to be the standard for glibc.  We already do this for
setsockopt(), for instance.

Rebuild glibc.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Remove (un)likely from the kernel interface (XCC)
Barret Rhoden [Tue, 20 Oct 2015 19:53:55 +0000 (15:53 -0400)]
Remove (un)likely from the kernel interface (XCC)

Unfortunately, some user libraries out there also #define likely and
unlikely, and those include both ros/common.h and parlib/common.h.
Since we're not using likely/unlikely as part of the kernel interface, I
don't mind moving it out of ros/.  But it's more unfortunate that we
can't put it in a parlib header too.

Reinstall your kernel headers.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Fix Rock sizeof sockaddr bug (XCC)
Barret Rhoden [Tue, 20 Oct 2015 19:25:55 +0000 (15:25 -0400)]
Fix Rock sizeof sockaddr bug (XCC)

Whenever we consider the size of a Rock's sockaddr, we need to use the
sizeof a struct sockaddr_storage.

Rebuild glibc.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Call printf() instead of fprintf in signal.c
Barret Rhoden [Mon, 26 Oct 2015 21:56:24 +0000 (17:56 -0400)]
Call printf() instead of fprintf in signal.c

Signal handlers run in vcore context.  We could page fault on the glibc
printf calls.  Our printf() is vcore-context safe.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Override glibc's printf for vcore context
Barret Rhoden [Mon, 26 Oct 2015 21:55:39 +0000 (17:55 -0400)]
Override glibc's printf for vcore context

Our printf is safe from vcore context.  Otherwise, glibc's printf may
run off the end of the vcore stack, depending on whether or not the
output stream is in buffered mode.

This only overrides printf, not functions like fprintf, vprintf, or any
of the other printf functions.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago pthread: Panic if there is a bad thread state
Barret Rhoden [Mon, 26 Oct 2015 21:44:31 +0000 (17:44 -0400)]
pthread: Panic if there is a bad thread state

There's no reason to merely print here.  If the type is wrong, there is
a critical bug in the 2LS.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Ensure vcore context code includes parlib/assert.h
Barret Rhoden [Mon, 26 Oct 2015 21:13:15 +0000 (17:13 -0400)]
Ensure vcore context code includes parlib/assert.h

Code that runs in vcore context should call parlib's assert.  Otherwise,
there is a chance glibc's assert will call glibc's printf, which may run
off the end of the stack and page fault.

By putting parlib/assert.h in common.h, any downstream headers, such as
vcore.h, event.h, and parlib.h, will pick up our assert().

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Do not breakpoint() on parlib's assert
Barret Rhoden [Mon, 26 Oct 2015 20:46:33 +0000 (16:46 -0400)]
Do not breakpoint() on parlib's assert

The breakpoint is very useful for debugging, and people can add it back
in to their local repos when needed, but it's not needed for the usual
assertion failure.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Remove #include <assert.h> from parlib
Barret Rhoden [Wed, 21 Oct 2015 21:52:40 +0000 (17:52 -0400)]
Remove #include <assert.h> from parlib

Other than the #include <assert.h> in parlib/assert.h, the other assert
includes actually do nothing, other than confuse people.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Rename parlib/rassert.h -> parlib/assert.h
Barret Rhoden [Wed, 21 Oct 2015 21:47:34 +0000 (17:47 -0400)]
Rename parlib/rassert.h -> parlib/assert.h

Slight change to the #includes in parlib/assert.h, since we are no
longer including Glibc's assert.h when we build parlib.  Other libraries
and tests do include glibc's assert.h.  This is due to the search path
of the user library Makefrag.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Move parlib's assert guts into a C file
Barret Rhoden [Wed, 21 Oct 2015 17:31:40 +0000 (13:31 -0400)]
Move parlib's assert guts into a C file

Having the machinery of the print and abort in the header file is
problematic for #include loops.  First, we need stdlib included, which
isn't a big deal.  Second, and more troublesome, is we need vcore.h.
That will cause problems with a later commit, where I have all parlib
code use parlib's assert.

ucq.c was relying on a transitive #include for printd.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Undefine static_assert() in parlib/rassert.h
Barret Rhoden [Wed, 21 Oct 2015 17:31:40 +0000 (13:31 -0400)]
Undefine static_assert() in parlib/rassert.h

static_assert() is #defined in glibc's assert.h under certain
circumstances.  We'll just use our own in parlib code.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Use a -D when building parlib
Barret Rhoden [Tue, 27 Oct 2015 18:59:48 +0000 (14:59 -0400)]
Use a -D when building parlib

The -D will be defined so that a parlib header can determine how to
 #include a file.

The search order for user libraries is to first check its include
directory in the source code, and then check the system headers.  This
is important so that we build with the latest headers, i.e. the ones we
just changed.

The problem comes when we want to include a system/glibc header that
happens to have the same name as a parlib header, e.g. assert.h in an
upcoming patch.  From within parlib, we could do an #include_next.  But
for external libraries and apps, that fails since user/parlib/include is
not on their search path.

With this -D, we can tell which situation we're in and #include
accordingly.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago pthread: Properly change state for running threads
Barret Rhoden [Tue, 20 Oct 2015 18:56:49 +0000 (14:56 -0400)]
pthread: Properly change state for running threads

We were only setting PTH_RUNNING for thread 0 early on.  After that, it
was all PTH_RUNNABLES.  We should now be able to assert that if a thread
is on the active list, it is marked PTH_RUNNING.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago pthread: Account for pth stopping in has_blocked
Barret Rhoden [Tue, 20 Oct 2015 18:16:19 +0000 (14:16 -0400)]
pthread: Account for pth stopping in has_blocked

This popped up as

uthread.c:621: run_uthread: Assertion `uthread->state == 2' failed.

and

pthread.c:246: pth_sched_entry: Assertion `new_thread->state == 2'
failed.

for an epoll/eventfd app.

The ready and active queues were corrupted, due to adding a pthread to
the ready queue in pth_thread_runnable() that was still on the active
queue.

Anytime a pthread stops and will eventually have pth_thread_runnable()
called, as is the case with blocking a uthread on an event queue (which
epoll does), then the pthread code needs to yank it off the active
queue.  Modifications of pthread->state are a good sign that list
management needs to be done.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
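
The list discipline described in the commit above, sketched with the sys/queue.h TAILQ macros. The sketch_ names and the state enum are hypothetical and much simpler than the real 2LS; the point is only that a thread must leave the active queue when it blocks, so that a later "runnable" callback can safely insert it into the ready queue.

```c
#include <stddef.h>
#include <sys/queue.h>

enum sketch_state { SK_RUNNABLE, SK_RUNNING, SK_BLOCKED };

struct sketch_pth {
	enum sketch_state state;
	TAILQ_ENTRY(sketch_pth) next;	/* one link: on one queue at a time */
};
TAILQ_HEAD(sketch_queue, sketch_pth);

struct sketch_queue active_q = TAILQ_HEAD_INITIALIZER(active_q);
struct sketch_queue ready_q = TAILQ_HEAD_INITIALIZER(ready_q);

void sketch_run(struct sketch_pth *t)
{
	t->state = SK_RUNNING;
	TAILQ_INSERT_TAIL(&active_q, t, next);
}

/* The fix: a state change away from running also leaves the active queue. */
void sketch_has_blocked(struct sketch_pth *t)
{
	TAILQ_REMOVE(&active_q, t, next);
	t->state = SK_BLOCKED;
}

/* Safe only because the thread is no longer linked into active_q. */
void sketch_thread_runnable(struct sketch_pth *t)
{
	t->state = SK_RUNNABLE;
	TAILQ_INSERT_TAIL(&ready_q, t, next);
}

size_t sketch_queue_len(struct sketch_queue *q)
{
	struct sketch_pth *t;
	size_t n = 0;

	TAILQ_FOREACH(t, q, next)
		n++;
	return n;
}
```

Because each thread carries a single TAILQ_ENTRY, inserting it into ready_q while it is still linked into active_q overwrites the links and corrupts both lists, which matches the symptom in the commit.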
4 years ago pthread: Factor out common active_queue code
Barret Rhoden [Tue, 20 Oct 2015 18:47:49 +0000 (14:47 -0400)]
pthread: Factor out common active_queue code

__pthread_generic_yield() isn't the greatest name for this, since it's
used in non-yield paths, but it beats reusing code, and it's not clear
if we'll want other accounting in there.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago pthread: Fix sem_timedwait() bug
Barret Rhoden [Mon, 19 Oct 2015 15:36:54 +0000 (11:36 -0400)]
pthread: Fix sem_timedwait() bug

__sem_timedblock() takes a sem_queue_element *, not a semaphore *.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Check system headers for warnings for userspace
Barret Rhoden [Mon, 19 Oct 2015 15:34:35 +0000 (11:34 -0400)]
Check system headers for warnings for userspace

-Wsystem-headers checks the headers for warnings.  The lack of this flag
was masking a minor bug with TAILQs.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago pthread: Fix semaphore's TAILQ type
Barret Rhoden [Mon, 19 Oct 2015 15:25:56 +0000 (11:25 -0400)]
pthread: Fix semaphore's TAILQ type

The semaphore's TAILQ isn't of pthreads, it's of sem_queue_elements.  I
spotted this while trying to debug something else in the area.  The
warning was hidden because the bad assignment was in a system header.

Specifically, when the TAILQ macros attempt to do some form of
assignment, they should generate an incompatible pointer type warning.
However, sys/queue.h is a system header, and those warnings are ignored
by default.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Print the vcoreid for unhandled faults in VC ctx
Barret Rhoden [Mon, 19 Oct 2015 15:21:50 +0000 (11:21 -0400)]
Print the vcoreid for unhandled faults in VC ctx

This is a little more useful when diagnosing bugs.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Properly include syscall.h in parlib/event.h
Barret Rhoden [Mon, 19 Oct 2015 15:10:45 +0000 (11:10 -0400)]
Properly include syscall.h in parlib/event.h

When building with -Wsystem-headers, we get a warning like:

... /usr/include/parlib/event.h:37:48:
warning: 'struct syscall' declared inside parameter list
 bool register_evq(struct syscall *sysc, struct event_queue *ev_q);

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Add the Inferno license to files we got from Inferno
Ronald G. Minnich [Tue, 27 Oct 2015 15:31:50 +0000 (08:31 -0700)]
Add the Inferno license to files we got from Inferno

This is long overdue. We just kept forgetting. But somebody in
Harvey wanted one of our files and at that point it's essential
to get this right.

Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
[minor touchups, added UCB and Google modifications]
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Add scripts/spelling.txt
Barret Rhoden [Tue, 27 Oct 2015 17:38:52 +0000 (13:38 -0400)]
Add scripts/spelling.txt

This is used by checkpatch, and copied from Linux's commit 69984b644407
("Merge tag 'arm64-fixes' of
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux")

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Adjust checkpatch.pl for Akaros's style
Barret Rhoden [Tue, 13 Oct 2015 14:34:04 +0000 (10:34 -0400)]
Adjust checkpatch.pl for Akaros's style

We do a few things differently than Linux.  All are bike-shedable, but
for now, this will allow most of our differences.

Note that checkpatch is a guideline, not a legal document.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Change checkpatch tab length to 4
Barret Rhoden [Tue, 13 Oct 2015 02:06:13 +0000 (22:06 -0400)]
Change checkpatch tab length to 4

For better or worse, we use 4-space-wide tabs.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Add checkpatch.pl
Barret Rhoden [Fri, 9 Oct 2015 19:11:43 +0000 (15:11 -0400)]
Add checkpatch.pl

Imported from Linux, commit c6fa8e6de3dc.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Fix listener / echo servers to handle char mode
Barret Rhoden [Fri, 9 Oct 2015 15:02:10 +0000 (11:02 -0400)]
Fix listener / echo servers to handle char mode

If you telnet in character mode, the server wouldn't output the input,
though it would correctly echo the input back to the client.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Close Qlisten FDs
Barret Rhoden [Fri, 9 Oct 2015 13:34:33 +0000 (09:34 -0400)]
Close Qlisten FDs

The original ipclose() wasn't built to close Qlistens, which can now be
opened via an O_PATH open.  Prior to O_PATH, if you did an open on
Qlisten, the chan you got back was a Qctl, so we did not need a special
case in ipclose().

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Update file-posix.c utest
Kevin Klues [Thu, 15 Oct 2015 05:21:06 +0000 (22:21 -0700)]
Update file-posix.c utest

Update this test to reflect some changes in the previous commit. This
utest is failing, however, because the vfs doesn't currently support
openat. Once this is fixed, this test should pass again.

Signed-off-by: Kevin Klues <klueska@cs.berkeley.edu>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Update utest infrastructure
Kevin Klues [Thu, 15 Oct 2015 05:19:57 +0000 (22:19 -0700)]
Update utest infrastructure

Updated to allow passing a continuation to run in the case of an assert
failure. Also, we now print the actual test performed in an assert
when no message is passed in.

Signed-off-by: Kevin Klues <klueska@cs.berkeley.edu>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Set ENOENT for failed 9ns lookups
Barret Rhoden [Thu, 15 Oct 2015 15:21:53 +0000 (11:21 -0400)]
Set ENOENT for failed 9ns lookups

In the changes from error() taking just an errstr to also taking errno,
some of the times we set_errno(ENOENT) were getting clobbered with
ENODEV.

This commit fixed any recent changes to ENOENT that I could find.  There
might be more, but there are no set_errno(ENOENT)s remaining in the 9ns
code.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Fixed error() reporting when error codes are reported by called functions as negative...
Davide Libenzi [Thu, 15 Oct 2015 14:53:46 +0000 (07:53 -0700)]
Fixed error() reporting when error codes are reported by called functions as negative errno codes.

Signed-off-by: Davide Libenzi <dlibenzi@google.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Check for ctx in default_core_handler()
Barret Rhoden [Thu, 8 Oct 2015 20:15:58 +0000 (16:15 -0400)]
Check for ctx in default_core_handler()

It's possible for a process to receive an event via sys_notify(), which
will not have a ctx.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Do not free epoll event queues
Barret Rhoden [Thu, 8 Oct 2015 16:09:46 +0000 (12:09 -0400)]
Do not free epoll event queues

As with many race conditions, something that can happen ends up being
quite likely.  With a little parallelism, I'm able to trigger bugs where
an outstanding INDIR pointing to an epoll event queue gets handled after
the ev_q gets cleaned up.

The long term solution is some form of user-deferred cleanup.  For now,
we can throw away a little RAM.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Handle 0 usec in rendez_sleep_timeout()
Barret Rhoden [Thu, 8 Oct 2015 15:03:23 +0000 (11:03 -0400)]
Handle 0 usec in rendez_sleep_timeout()

Userspace can ask for 0 sec, so an assert is wrong.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Allow freeaddrinfo(NULL) (XCC)
Xiao Jia [Tue, 13 Oct 2015 23:57:00 +0000 (16:57 -0700)]
Allow freeaddrinfo(NULL) (XCC)

On both glibc and uclibc, freeaddrinfo(NULL) is safe, though there is no
explicit requirement for that in the spec:
http://pubs.opengroup.org/onlinepubs/009695399/functions/getaddrinfo.html

So let's change freeaddrinfo() to handle that case.

Rebuild glibc.

Signed-off-by: Xiao Jia <stfairy@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Fix get_num_numa() loop in x86 topology.c
Kevin Klues [Wed, 7 Oct 2015 15:57:21 +0000 (08:57 -0700)]
Fix get_num_numa() loop in x86 topology.c

The old loop assumed sorted, monotonically increasing numa domains in
the SRlapic tables.  This loop does not.

Signed-off-by: Kevin Klues <klueska@cs.berkeley.edu>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
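
One order-independent way to count NUMA domains, shown only to illustrate the kind of loop the commit above describes. The actual fix is not given in the commit message. This sketch assumes domain IDs are small non-negative integers numbered from 0 without gaps, and takes them as a plain array rather than walking real ACPI tables.

```c
#include <stddef.h>

/* Returns the number of NUMA domains, given the domain ID of each entry
 * in any order.  Tracks the largest ID seen instead of counting changes
 * between neighbouring entries, so sorting does not matter. */
int sketch_get_num_numa(const int *domains, size_t n)
{
	int max_id = -1;

	for (size_t i = 0; i < n; i++) {
		if (domains[i] > max_id)
			max_id = domains[i];
	}
	return max_id + 1;
}
```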
4 years ago Sync fork/exec() with updates to procinfo/procdata
Kevin Klues [Tue, 13 Oct 2015 04:40:26 +0000 (21:40 -0700)]
Sync fork/exec() with updates to procinfo/procdata

The pipetest added in the previous commit exposes a bug with not copying
enough data from procinfo/procdata into a forked process in cases where
we don't exec after the fork. This commit updates the fork and exec code
to make sure that the relevant fields in procinfo/procdata are copied
over properly on a fork. If we do end up execing, these fields are
reinitialized to their initial values.

Signed-off-by: Kevin Klues <klueska@cs.berkeley.edu>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Simple pipe test with fork()
Kevin Klues [Mon, 12 Oct 2015 20:49:26 +0000 (13:49 -0700)]
Simple pipe test with fork()

This test forks off a child process and sets up pipes for communication
with it. It tests both pipes and forks of SCPs with blocking system
calls.

Signed-off-by: Kevin Klues <klueska@cs.berkeley.edu>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Added set_error() API to have a single point of entry for setting errno and errstr.
Davide Libenzi [Wed, 7 Oct 2015 20:09:23 +0000 (13:09 -0700)]
Added set_error() API to have a single point of entry for setting errno and errstr.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Avoid an extra function call on the error frame handling.
Davide Libenzi [Wed, 7 Oct 2015 19:19:41 +0000 (12:19 -0700)]
Avoid an extra function call on the error frame handling.

[Removed extra line at EOF in err.h]

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Changed a few EFAIL to proper errno codes.
Davide Libenzi [Wed, 7 Oct 2015 17:37:24 +0000 (10:37 -0700)]
Changed a few EFAIL to proper errno codes.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Dropped char* error file to unify common error strings handling.
Davide Libenzi [Wed, 7 Oct 2015 14:30:14 +0000 (07:30 -0700)]
Dropped char* error file to unify common error strings handling.

[Touched up an off-by-one with error <= MAX_ERRNO]

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Added explicit errno reporting from error() API.
Davide Libenzi [Wed, 7 Oct 2015 02:46:23 +0000 (19:46 -0700)]
Added explicit errno reporting from error() API.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Remove errstrings.h, in favor of error.c
Kevin Klues [Wed, 7 Oct 2015 21:40:01 +0000 (14:40 -0700)]
Remove errstrings.h, in favor of error.c

The errstrings.h file was only used to generate an error_strings[] table
in error.h.  Instead, we now generate an error.c file which defines a
table similar to this (called errno_strings[]), but is part of a .c file
instead of a .h file. We then extern this table in through error.h. The
error.c file is automatically generated if we ever change ros/errno.h.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
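
The header/source split described in the commit above, collapsed into one block for illustration. The table name errno_strings[] comes from the commit; the length variable, the lookup helper, and the handful of entries are hypothetical, and the real error.c is generated from ros/errno.h rather than written by hand.

```c
#include <errno.h>
#include <stddef.h>

/* What error.h would expose: declarations only, no table body. */
extern const char *const errno_strings[];
extern const size_t errno_strings_len;

/* What the generated error.c would define. */
const char *const errno_strings[] = {
	[0]      = "Success",
	[EPERM]  = "Operation not permitted",
	[ENOENT] = "No such file or directory",
	[EINVAL] = "Invalid argument",
};
const size_t errno_strings_len =
	sizeof(errno_strings) / sizeof(errno_strings[0]);

/* Bounds-checked lookup; holes in the table are NULL. */
const char *errno_to_string(int err)
{
	if (err < 0 || (size_t)err >= errno_strings_len || !errno_strings[err])
		return "Unknown error";
	return errno_strings[err];
}
```

Keeping the definition in a .c file means the table exists once in the binary instead of once per file that includes the header.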
4 years ago Replaced dummy likely/unlikely definitions with the ones coming from compiler.h.
Davide Libenzi [Thu, 8 Oct 2015 19:35:13 +0000 (12:35 -0700)]
Replaced dummy likely/unlikely definitions with the ones coming from compiler.h.

Signed-off-by: Davide Libenzi <dlibenzi@google.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Added support for static branch hinting (XCC)
Davide Libenzi [Thu, 8 Oct 2015 18:39:56 +0000 (11:39 -0700)]
Added support for static branch hinting (XCC)

Signed-off-by: Davide Libenzi <dlibenzi@google.com>
[Reinstall your kernel headers]
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Use process struct flag to indicate tracing instead of scanning an array.
Davide Libenzi [Thu, 8 Oct 2015 19:28:07 +0000 (12:28 -0700)]
Use process struct flag to indicate tracing instead of scanning an array.

Signed-off-by: Davide Libenzi <dlibenzi@google.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Restore flags interrupts on the error path.
Davide Libenzi [Wed, 7 Oct 2015 23:49:51 +0000 (16:49 -0700)]
Restore flags interrupts on the error path.

Signed-off-by: Davide Libenzi <dlibenzi@google.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years ago Avoid void* error buffer declaration
Davide Libenzi [Thu, 8 Oct 2015 21:23:05 +0000 (14:23 -0700)]
Avoid void* error buffer declaration

Place the errbuf declaration so that it can be explicitly declared, instead
of as a void pointer.

Signed-off-by: Davide Libenzi <dlibenzi@google.com>
[Touched up commit formatting / typos]
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years agomlx4: Support transmitting block extra data
Xiao Jia [Wed, 30 Sep 2015 22:06:23 +0000 (15:06 -0700)]
mlx4: Support transmitting block extra data

Signed-off-by: Xiao Jia <stfairy@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years agoSupport block extra data in adjustblock
Xiao Jia [Tue, 22 Sep 2015 23:45:53 +0000 (16:45 -0700)]
Support block extra data in adjustblock

I tested this change by running

    ping <host>
    ping -s 5000 <host>

and verified both ping executions succeeded.

For the above testing, I also patched en_tx.c temporarily with
linearizeblock.

Signed-off-by: Xiao Jia <stfairy@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years agoAdd helper function block_append_extra
Xiao Jia [Tue, 22 Sep 2015 23:44:39 +0000 (16:44 -0700)]
Add helper function block_append_extra

It will be used by block extra data fixes.

Signed-off-by: Xiao Jia <stfairy@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years agoUse block_add_extd retval to detect success or error
Xiao Jia [Wed, 7 Oct 2015 21:29:30 +0000 (14:29 -0700)]
Use block_add_extd retval to detect success or error

Signed-off-by: Xiao Jia <stfairy@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years agoChange block_add_extd to return success or error
Xiao Jia [Tue, 29 Sep 2015 21:58:35 +0000 (14:58 -0700)]
Change block_add_extd to return success or error

Signed-off-by: Xiao Jia <stfairy@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years agoRefactor freeb to separate out free_block_extra
Xiao Jia [Tue, 22 Sep 2015 23:43:25 +0000 (16:43 -0700)]
Refactor freeb to separate out free_block_extra

Signed-off-by: Xiao Jia <stfairy@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years agoPrint block content and backtrace in PANIC_EXTRA
Xiao Jia [Fri, 18 Sep 2015 01:22:23 +0000 (18:22 -0700)]
Print block content and backtrace in PANIC_EXTRA

Signed-off-by: Xiao Jia <stfairy@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years agoMove network config blocks out of ifconfig
Xiao Jia [Tue, 29 Sep 2015 19:48:48 +0000 (12:48 -0700)]
Move network config blocks out of ifconfig

Existing "known good" sections and qemu default are moved to
/etc/network/default.  Users can write their own config blocks in
/etc/network/local which is not under version control.  Users can
also put individual custom stuff under /etc/network/local.d/ which
will be picked up by the ifconfig script.

Signed-off-by: Xiao Jia <stfairy@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years agoAllow fcntl() to toggle O_NONBLOCK
Barret Rhoden [Thu, 1 Oct 2015 14:12:56 +0000 (10:12 -0400)]
Allow fcntl() to toggle O_NONBLOCK

There isn't a good answer yet for whether O_NONBLOCK should be a chan
flag, a device file flag, or both.  For now, I'll let people toggle
O_NONBLOCK.  In the future, we might send a wstat() too and set a file
mode bit (similar to DMAPPEND).

For some devices, like #ip, this will set the chan flag but will have no
effect on the device.  It's not a huge deal, since the sockets shims
intercept fcntl on Rocks and issue the device command.
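
From userspace, toggling the flag would presumably look like the usual
POSIX read-modify-write of the file status flags.  The sketch below assumes
the standard F_GETFL/F_SETFL commands; the set_nonblock() helper is a
made-up name, and nothing here is taken from the Akaros sources.

```c
#include <assert.h>
#include <fcntl.h>

/* Hypothetical helper: set (on != 0) or clear (on == 0) O_NONBLOCK on
 * fd, leaving the other status flags untouched.
 * Returns 0 on success, -1 on failure (errno set by fcntl). */
static int set_nonblock(int fd, int on)
{
	int flags;

	assert(fd >= 0);
	flags = fcntl(fd, F_GETFL);
	if (flags < 0)
		return -1;
	if (on)
		flags |= O_NONBLOCK;
	else
		flags &= ~O_NONBLOCK;
	return fcntl(fd, F_SETFL, flags) < 0 ? -1 : 0;
}
```

As noted above, for a device like #ip this would only flip the chan flag;
whether the device itself honors it is a separate matter.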

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years agoCheck MMAP_LOWEST_VA in __is_user_addr()
Barret Rhoden [Thu, 1 Oct 2015 15:14:20 +0000 (11:14 -0400)]
Check MMAP_LOWEST_VA in __is_user_addr()

The main utility for this is in debugging.  We'll be more likely to
distinguish a kernel page fault due to a buggy kernel from a page fault
due to a bad user.  The latter we'll eventually deal with.  The former
requires a bug fix.
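
A minimal sketch of the kind of check this implies: reject anything below
the lowest mappable address as well as anything past the upper limit.  The
constant values, the USER_LIM name, and the is_user_range() helper are
assumptions for illustration only; the kernel's __is_user_addr() may be
structured differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MMAP_LOWEST_VA 0x1000UL		/* hypothetical value */
#define USER_LIM       0x80000000UL	/* hypothetical upper bound */

/* Hypothetical helper: true if [va, va + len) lies entirely within
 * [MMAP_LOWEST_VA, lim].  Written to avoid overflow in va + len. */
static bool is_user_range(uintptr_t va, size_t len, uintptr_t lim)
{
	assert(lim >= MMAP_LOWEST_VA);
	if (va < MMAP_LOWEST_VA)	/* catches NULL and near-NULL pointers */
		return false;
	if (va > lim)
		return false;
	return len <= lim - va;
}
```

With the low-address check in place, a null pointer handed in by the user
is rejected up front rather than being accepted as a user address.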

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years agoRefactor is_user_r{w,}addr()
Barret Rhoden [Tue, 29 Sep 2015 16:22:33 +0000 (12:22 -0400)]
Refactor is_user_r{w,}addr()

Use a helper for the common code.  I used this briefly when I wanted to
catch null pointers for debugging.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years agoAdd a diagnostic to print info about a core
Barret Rhoden [Thu, 1 Oct 2015 15:00:09 +0000 (11:00 -0400)]
Add a diagnostic to print info about a core

This popped up as a "would be nice to have" feature while tracking down
a bug.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years agoInitialize the rwlock in struct pgrp
Barret Rhoden [Thu, 1 Oct 2015 01:20:17 +0000 (21:20 -0400)]
Initialize the rwlock in struct pgrp

This escaped our notice for a while, since the reader CV is only used
when there is a writer.  That requires some R/W contention on the lock.

The way this popped up was if you did two ifconfigs back to back.
ifconfig does some mounting and cs removes old #srv/cs files.  Some
combination of these led to a writer mucking with the NS while another
thread was calling findmount.  The concurrent threads were a critical
part, and things like printks could throw it all off.  Good times.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years agoAccept more types of FD Taps in #eventfd
Barret Rhoden [Tue, 29 Sep 2015 19:09:33 +0000 (15:09 -0400)]
Accept more types of FD Taps in #eventfd

The taps will never fire, but people can at least ask for them.  Asking
for ERROR is actually legitimate in Linux.

Being strict about which taps people ask for might be too harsh, though
in general if someone asks for something, they might actually care
whether it happens or not.  For instance, we don't want to have
a device that can't do a READABLE tap and then have an app that thinks
it is waiting for READABLE to fire.  It never will, and the application
will hang.  So perhaps being strict on the more *important* taps is the
way to go.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years agoAdd #cons/urandom
Barret Rhoden [Tue, 29 Sep 2015 15:55:47 +0000 (11:55 -0400)]
Add #cons/urandom

Under the hood, it's the same as #cons/random.  We can look into what we
really need for random and urandom when we overhaul all of the random
generating code.  If anything, it's all urandom, and we'll need a
stronger real 'random.'

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years agoMake syscall trace records for all copy_path calls
Barret Rhoden [Mon, 28 Sep 2015 20:50:49 +0000 (16:50 -0400)]
Make syscall trace records for all copy_path calls

When we traced an open call, we'd save the path.  There are a lot of
other syscalls that have paths, and we weren't tracing them.  By tracing
during copy_path(), we deal with 90% of those cases.  The other cases
are ones in which there are multiple copy_path() calls per syscall.  In
those cases, the specific syscall can change the t->data string
afterwards, as is done in the case of sys_rename().

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years agoFix bugs with syscall trace record data copies
Barret Rhoden [Mon, 28 Sep 2015 20:50:49 +0000 (16:50 -0400)]
Fix bugs with syscall trace record data copies

openat()'s trace was using path_l, instead of the MIN of path_l and the
buffer size.  Whoops - you'd overflow the buffer and cause all sorts of
problems (and only when tracing!).

write()'s trace was using ret, which could be -1.  We want to capture
what was attempted to be written, regardless of failure.  That's 'len',
not ret.

There's also no reason to use memmove.  We know the buffer for the trace
is distinct from its input.
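
The three fixes above boil down to one pattern, sketched here with made-up
names: clamp the copy to the record's buffer, size it by what the caller
attempted rather than by the return value, and use memcpy since the buffers
cannot overlap.  struct trace_rec, TRACE_DATA_SZ, and trace_copy() are
assumptions for the example, not the kernel's actual definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TRACE_DATA_SZ 16	/* hypothetical; small so truncation shows */
#define MIN(a, b) ((a) < (b) ? (a) : (b))

struct trace_rec {		/* hypothetical stand-in for a trace record */
	size_t datalen;
	char data[TRACE_DATA_SZ];
};

/* Copy up to sizeof(t->data) bytes of what the caller attempted to
 * read/write/open.  'attempted' is the requested length (e.g. len or
 * path_l), never the syscall's return value, which may be -1. */
static size_t trace_copy(struct trace_rec *t, const void *src,
                         size_t attempted)
{
	size_t n = MIN(attempted, sizeof(t->data));

	assert(t && src);
	memcpy(t->data, src, n);	/* distinct buffers: memcpy is fine */
	t->datalen = n;
	return n;
}
```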

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years agoChange len from int -> size_t for sys_{read,write}
Barret Rhoden [Mon, 28 Sep 2015 21:01:15 +0000 (17:01 -0400)]
Change len from int -> size_t for sys_{read,write}

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years agoChange sysname to Akaros
Barret Rhoden [Mon, 28 Sep 2015 20:42:47 +0000 (16:42 -0400)]
Change sysname to Akaros

I like how the old default was Windows 95.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years agoFix benchutil alarm open to use #alarm
Barret Rhoden [Mon, 28 Sep 2015 22:30:03 +0000 (18:30 -0400)]
Fix benchutil alarm open to use #alarm

Fixes: 7fb6e2110113 ("Modify userspace to use device names [2/3]")

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
4 years agoTrack startup kthreads as ktasks
Barret Rhoden [Mon, 28 Sep 2015 22:11:27 +0000 (18:11 -0400)]
Track startup kthreads as ktasks

The ktask flag marks whether or not we are a kernel task or a
user-backing kthread (meaning we're executing a syscall or the user is
running).  Parts of the codebase assume you are one or the other, as in
rendez_sleep().  That's probably a correct assumption.

If you call kthread_usleep() before smp_idle(), we'd get in a situation
where we had a kthread (the startup kthread) that was not a ktask (prior
to this commit) and the code that handles aborting syscalls would think
there should be a user syscall associated with the kthread.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>