akaros.git
3 years agoRemove cpu_feats from kernel-features.h (XCC)
Barret Rhoden [Fri, 11 Mar 2016 14:55:21 +0000 (09:55 -0500)]
Remove cpu_feats from kernel-features.h (XCC)

It turns out that glibc doesn't need its own copy of the cpu_feats, and it
can just include parlib's.  It may be that some code in glibc won't be
able to include parlib files.  If that's the case, and those files need
cpu_feats, then we can revisit this.

This popped up as a problem when a file in glibc included both
kernel-features.h and parlib/cpu_feat.h.

Rebuild glibc if you want.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoRemoving extra run_vmthread calls.
GanShun [Thu, 10 Mar 2016 21:39:15 +0000 (13:39 -0800)]
Removing extra run_vmthread calls.

These calls should only be made at the bottom of the while loop, otherwise
we run the risk of missing a vmexit.

Signed-off-by: GanShun <ganshun@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoReturn real vendor/part id in query_device
Kanoj Sarcar [Thu, 10 Mar 2016 19:24:57 +0000 (11:24 -0800)]
Return real vendor/part id in query_device

ofed_perftest tool cares about vendor/part id.

Signed-off-by: Kanoj Sarcar <kanoj@google.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoClean up logic in MSR read/write functions.
Dan Cross [Tue, 8 Mar 2016 19:48:59 +0000 (14:48 -0500)]
Clean up logic in MSR read/write functions.

These code paths could be cleaned up and a level of indentation removed.
Also, remove the use of atomic types as they are unneeded in this case.

Signed-off-by: Dan Cross <crossd@gmail.com>
[minor git-fu]
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoAdded comment to note that fninit clears FOP
Michael Taufen [Sat, 5 Mar 2016 00:27:20 +0000 (16:27 -0800)]
Added comment to note that fninit clears FOP

Signed-off-by: Michael Taufen <mtaufen@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoFP save/restore security patch for AMD processors
Michael Taufen [Sat, 5 Mar 2016 00:25:44 +0000 (16:25 -0800)]
FP save/restore security patch for AMD processors

AMD processors do not save/restore the FOP/FIP/FDP values from/to the
x87 FPU unless an unmasked FPU exception is pending. This can result in
a state leak between processes during a context switch, and is a
potential security hole.

See CVE-2006-1056 and CVE-2013-2076 on cve.mitre.org.

Signed-off-by: Michael Taufen <mtaufen@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoExtended state AMD backwards compatibility updates (XCC)
Michael Taufen [Thu, 3 Mar 2016 21:32:05 +0000 (13:32 -0800)]
Extended state AMD backwards compatibility updates (XCC)

Rebuild your universe (kernel headers and user apps)!

These updates allow Akaros to defer to FXSAVE instructions in the event
that the processor does not support the XSAVE instructions. This is
necessary for Akaros to run on older AMD processors (pre-Bulldozer).

Akaros will still refuse to boot if you do not have support for FXSAVE.

These updates also include additional CPU feature detection,
particularly x86 vendor detection and support for the XSAVE instruction.

Finally, these updates allow the use of XSAVE in the absence of
XSAVEOPT, because it was an easy patch and we don't have to be that
mean.
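
The fallback order described above can be sketched as a small selection
function.  This is only an illustration under stated assumptions: the struct,
enum, and function names below are hypothetical and are not taken from the
Akaros sources.

```c
#include <stdbool.h>

/* Hypothetical feature flags; a real kernel would fill these from CPUID. */
struct sketch_fp_feats {
	bool fxsr;	/* FXSAVE/FXRSTOR available */
	bool xsave;	/* XSAVE/XRSTOR available */
	bool xsaveopt;	/* XSAVEOPT available */
};

enum sketch_fp_save {
	SKETCH_FP_NONE,		/* no FXSAVE: refuse to boot */
	SKETCH_FP_FXSAVE,	/* older CPUs without XSAVE */
	SKETCH_FP_XSAVE,	/* XSAVE without XSAVEOPT */
	SKETCH_FP_XSAVEOPT,	/* preferred when available */
};

/* Pick the most capable save instruction the CPU supports. */
static inline enum sketch_fp_save
sketch_pick_fp_save(const struct sketch_fp_feats *f)
{
	if (!f->fxsr)
		return SKETCH_FP_NONE;
	if (f->xsave && f->xsaveopt)
		return SKETCH_FP_XSAVEOPT;
	if (f->xsave)
		return SKETCH_FP_XSAVE;
	return SKETCH_FP_FXSAVE;
}
```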

Signed-off-by: Michael Taufen <mtaufen@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoAdded vmrunkernel option for extending the kernel command line passed to the guest
Michael Taufen [Sat, 27 Feb 2016 00:03:11 +0000 (16:03 -0800)]
Added vmrunkernel option for extending the kernel command line passed to the guest

vmrunkernel now targets the launcher program in our linux fork's initramfs
instead of init (see rminnich/linux and mtaufen/ak-vm-tests)

Signed-off-by: Michael Taufen <mtaufen@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoClean up IPv6 sources.
Dan Cross [Wed, 9 Mar 2016 16:30:10 +0000 (11:30 -0500)]
Clean up IPv6 sources.

I'm diving into IPv6 code to get it working. These are trivial
cleanups that I don't want to obscure potential future changes
that would be more substantive.

Remove redundant or unused headers, whitespace cleanups, etc.

Signed-off-by: Dan Cross <crossd@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoARRAY_SIZE is the standard in the kernel.
Dan Cross [Tue, 8 Mar 2016 20:38:15 +0000 (15:38 -0500)]
ARRAY_SIZE is the standard in the kernel.

Trivial change to follow the convention used elsewhere in the
kernel.
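
For reference, the usual shape of such a macro is sketched below.  The exact
definition in the Akaros tree is not shown in this log, so treat this as the
common idiom rather than the kernel's own text; the array and helper are
made-up examples.

```c
#include <stddef.h>

/* Number of elements in a true array.  Do not pass a pointer: sizeof would
 * then measure the pointer, not the array. */
#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

/* Hypothetical table, only here to show the macro in use. */
static const char *const sketch_cmds[] = { "start", "stop", "flush" };

static inline size_t sketch_nr_cmds(void)
{
	return ARRAY_SIZE(sketch_cmds);
}
```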

Signed-off-by: Dan Cross <crossd@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoClean up profiler configure and usage functions.
Dan Cross [Tue, 8 Mar 2016 16:54:42 +0000 (11:54 -0500)]
Clean up profiler configure and usage functions.

An incidental cleanup that became evident from the last cleanup;
the 'profiler_configure' function was unnecessarily hard to
follow due to lack of an early return.

Also, there was this odd function to return an array of strings
that could be used to construct an error message, but that were
used nowhere else; this was an encapsulation failure.  Change
that to just construct the error message and call it.

Arguably, the configure function should just call 'error()'. Oh
well.

Signed-off-by: Dan Cross <crossd@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoClean up profiler variables and formatting.
Dan Cross [Tue, 8 Mar 2016 16:39:46 +0000 (11:39 -0500)]
Clean up profiler variables and formatting.

Remove unused variables, move loop indices to their loop,
use void* instead of char* in several places, clean up
declaration formatting.

Signed-off-by: Dan Cross <crossd@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoFix formatting: leading spaces to tabs, and fix continued-line alignment.
Dan Cross [Tue, 8 Mar 2016 15:37:27 +0000 (10:37 -0500)]
Fix formatting: leading spaces to tabs, and fix continued-line alignment.

Indent using tabs, not spaces.

In the event that a line must be broken due to length, the coding
standard says to break it so that we use tabs to advance the
continued line to the level of indentation of the broken line,
and then spaces to align to the opening parenthesis.

Signed-off-by: Dan Cross <crossd@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoAdd in more uverbs backward compatibility
Kanoj Sarcar [Tue, 8 Mar 2016 00:33:32 +0000 (16:33 -0800)]
Add in more uverbs backward compatibility

Add in support for older style extended query device.

Signed-off-by: Kanoj Sarcar <kanoj@google.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoAdd rdmsr and wrmsr utilities
Barret Rhoden [Mon, 7 Mar 2016 19:07:17 +0000 (14:07 -0500)]
Add rdmsr and wrmsr utilities

Note that wrmsr writes the same MSR value to *all* cores.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoAdd a helper for querying the number of cores
Barret Rhoden [Mon, 7 Mar 2016 19:04:04 +0000 (14:04 -0500)]
Add a helper for querying the number of cores

The info is exposed via #vars.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoRemove MAX_VCORES
Barret Rhoden [Mon, 7 Mar 2016 19:23:11 +0000 (14:23 -0500)]
Remove MAX_VCORES

This was limiting us to 64 vcores.  Instead of cranking the number up, I
opted to just remove the #define completely.  We should be able to figure
these things out dynamically.

Right now MAX_NUM_CORES is 256 for x86.  That was due to the old xAPIC.
One of these days we'll actually want to run on a large-scale SMP machine
and will want to increase that.  And then we'll also start worrying about
the size of things that grow O(MAX_NUM_CORES) for every process, e.g.
procdata.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoRemove MCS dissemination barrier
Barret Rhoden [Mon, 7 Mar 2016 19:20:12 +0000 (14:20 -0500)]
Remove MCS dissemination barrier

It's a cool thing, but it has a few problems.
- It wants to know statically how many vcores there are (max, at least).
- It doesn't pad its dissem structure properly (it adds 64 bytes extra, not
  of padding, but just an array).
- It doesn't handle preemption.

All of these can be fixed, if we actually want the barriers.  In that case,
we can bring this code back and fix up the above three things.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agox86: Fix devarch's MSR error handling
Barret Rhoden [Mon, 7 Mar 2016 18:58:59 +0000 (13:58 -0500)]
x86: Fix devarch's MSR error handling

In some cases, we weren't even setting errno, just returning -1.  Then on
error, we'd get crap from perror() like:

pread: Success

Now get meaningful errstrs and at least have errno set.

E.g.

(On a machine without IA32_PERF_CTL)
/ $ rdmsr 0x199
pread: Bad address, read_msr() faulted on MSR 0x199
/ $ wrmsr 0x198 88888
pwrite: Operation not permitted, MSR 0x198 not in write whitelist

Most of the other errors would be triggered by a rdmsr or wrmsr bugging
out.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agox86: Properly initialize MSR whitelists
Barret Rhoden [Mon, 7 Mar 2016 18:56:37 +0000 (13:56 -0500)]
x86: Properly initialize MSR whitelists

The address ranges need to be initialized so that they are sorted.
Otherwise, whoever adds entries needs to know the actual value of the MSRs
and maintain their ordering manually.
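
One way to get that property is to sort the ranges once at init time and then
binary-search them.  The sketch below assumes non-overlapping, inclusive
ranges; the struct and function names are hypothetical and this is not the
devarch code itself.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct sketch_msr_range {
	uint32_t start;
	uint32_t end;	/* inclusive */
};

/* qsort comparator: order ranges by their start address. */
static int sketch_range_cmp(const void *a, const void *b)
{
	const struct sketch_msr_range *ra = a, *rb = b;

	if (ra->start < rb->start)
		return -1;
	if (ra->start > rb->start)
		return 1;
	return 0;
}

/* bsearch comparator: is the address below, inside, or above the range? */
static int sketch_addr_cmp(const void *key, const void *elem)
{
	uint32_t addr = *(const uint32_t *)key;
	const struct sketch_msr_range *r = elem;

	if (addr < r->start)
		return -1;
	if (addr > r->end)
		return 1;
	return 0;
}

/* Sort once at init, so whoever adds entries need not order them by hand. */
static inline void sketch_ranges_init(struct sketch_msr_range *r, size_t nr)
{
	qsort(r, nr, sizeof(*r), sketch_range_cmp);
}

/* Lookup relies on the sorted order established by sketch_ranges_init(). */
static inline bool sketch_msr_allowed(const struct sketch_msr_range *r,
                                      size_t nr, uint32_t addr)
{
	return bsearch(&addr, r, nr, sizeof(*r), sketch_addr_cmp) != NULL;
}
```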

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agox86: Use FSGSBASE for TLS changes (XCC)
Barret Rhoden [Mon, 29 Feb 2016 23:34:45 +0000 (18:34 -0500)]
x86: Use FSGSBASE for TLS changes (XCC)

When the CPU feature is available, userspace and the kernel will use the
instructions (e.g. wrfsbase) to change TLS.

Rebuild glibc.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agox86: use setters/getters for MSR_{FS,GS}_BASE
Barret Rhoden [Mon, 29 Feb 2016 23:24:40 +0000 (18:24 -0500)]
x86: use setters/getters for MSR_{FS,GS}_BASE

We need to be a little careful in the kernel with using these before cr4 is
set.  We'll eventually set cr4 to enable this usage in arch_pcpu_init.  For
the most part, any MSR accesses of this sort will happen after smp_boot,
which is fine.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agox86: Detect XSAVEOPT
Barret Rhoden [Mon, 29 Feb 2016 20:49:24 +0000 (15:49 -0500)]
x86: Detect XSAVEOPT

This is an example of how the kernel can set and query CPU features.  For
the most part, we should do all of the cpu_set_feat() very early during
boot in cpuinfo.

XSAVEOPT implies XSAVE, so we have just CPU_FEAT_X86_XSAVEOPT.

With these changes, both the user and the kernel can check at runtime for
XSAVEOPT and adapt accordingly.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoAdd CPU feature detection (XCC)
Barret Rhoden [Mon, 29 Feb 2016 20:45:13 +0000 (15:45 -0500)]
Add CPU feature detection (XCC)

Userspace, Glibc, and the kernel can now query whether the CPU has certain
features with

bool cpu_has_feat(int feature);

Some CPU features are architecture independent, such as the support for
virtual machines.  Most others will be architecture dependent.  I added a
few feature bits as an example, though they are not used yet.

To use within the kernel:

#include <cpu_feat.h>

To use within glibc:

#include <kernel-features.h>

To use in generic userspace (e.g. user/*, tests/*, etc):

#include <parlib/cpu_feat.h>

Reinstall your kernel headers to use the features.  Rebuild glibc to make
sure I didn't mess anything up.
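
A feature query of this kind is commonly backed by a bitmap.  The following is
a stand-alone sketch, not the Akaros implementation: the sketch_ functions
mirror the cpu_has_feat(int) signature above, the setter's signature is an
assumption, and apart from CPU_FEAT_X86_XSAVEOPT the enum names are
illustrative.

```c
#include <limits.h>
#include <stdbool.h>

enum sketch_cpu_feat {
	SKETCH_CPU_FEAT_VMM = 0,	/* arch-independent example (assumed name) */
	CPU_FEAT_X86_XSAVEOPT,		/* name used elsewhere in this log */
	SKETCH_CPU_FEAT_X86_FSGSBASE,	/* assumed name */
	SKETCH_NR_CPU_FEAT,
};

#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_NR_LONGS \
	((SKETCH_NR_CPU_FEAT + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG)

/* One bit per feature; zero-initialized, so nothing is set until boot code
 * calls the setter. */
static unsigned long sketch_feat_bits[SKETCH_NR_LONGS];

/* Record that the CPU has a feature.  Out-of-range values are ignored. */
static inline void sketch_cpu_set_feat(int feature)
{
	if (feature < 0 || feature >= SKETCH_NR_CPU_FEAT)
		return;
	sketch_feat_bits[feature / SKETCH_BITS_PER_LONG] |=
		1UL << (feature % SKETCH_BITS_PER_LONG);
}

/* Query a feature.  Out-of-range values report false. */
static inline bool sketch_cpu_has_feat(int feature)
{
	if (feature < 0 || feature >= SKETCH_NR_CPU_FEAT)
		return false;
	return sketch_feat_bits[feature / SKETCH_BITS_PER_LONG] &
	       (1UL << (feature % SKETCH_BITS_PER_LONG));
}
```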

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoAdd proc_global_info (XCC)
Barret Rhoden [Mon, 29 Feb 2016 18:38:35 +0000 (13:38 -0500)]
Add proc_global_info (XCC)

This is a read-only, shared-memory region mapped into every process's
address space.

Rebuild the world.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoFix mxcsr boot time init
Michael Taufen [Mon, 29 Feb 2016 16:57:53 +0000 (08:57 -0800)]
Fix mxcsr boot time init

The mxcsr register should be initialized to its power on default of 0x1f80.
This masks all SIMD floating point exceptions and clears all SIMD
floating-point exception flags, sets rounding control to round-nearest,
disables flush-to-zero mode, and disables denormals-are-zero mode.
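
To see why 0x1f80 has those effects, the value can be decoded against the
MXCSR bit layout (flags in bits 0-5, DAZ in bit 6, masks in bits 7-12,
rounding control in bits 13-14, FTZ in bit 15).  The macro and helper names
below are illustrative, not taken from the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MXCSR_DEFAULT	0x1f80u
#define SKETCH_MXCSR_FLAGS	0x003fu	/* bits 0-5: exception flags */
#define SKETCH_MXCSR_DAZ	0x0040u	/* bit 6: denormals-are-zero */
#define SKETCH_MXCSR_MASKS	0x1f80u	/* bits 7-12: exception masks */
#define SKETCH_MXCSR_RC		0x6000u	/* bits 13-14: rounding control */
#define SKETCH_MXCSR_FTZ	0x8000u	/* bit 15: flush-to-zero */

/* All six SIMD floating-point exceptions are masked. */
static inline bool sketch_mxcsr_all_masked(uint32_t v)
{
	return (v & SKETCH_MXCSR_MASKS) == SKETCH_MXCSR_MASKS;
}

/* No exception flag is currently set. */
static inline bool sketch_mxcsr_flags_clear(uint32_t v)
{
	return (v & SKETCH_MXCSR_FLAGS) == 0;
}

/* Rounding control field: 0 means round-to-nearest. */
static inline unsigned int sketch_mxcsr_rounding(uint32_t v)
{
	return (v & SKETCH_MXCSR_RC) >> 13;
}

/* Both flush-to-zero and denormals-are-zero are disabled. */
static inline bool sketch_mxcsr_ftz_daz_off(uint32_t v)
{
	return (v & (SKETCH_MXCSR_FTZ | SKETCH_MXCSR_DAZ)) == 0;
}
```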

Signed-off-by: Michael Taufen <mtaufen@gmail.com>
[ removed a couple extra newlines ]
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoVirtualization changes to handle X2APIC mode.
GanShun [Thu, 17 Dec 2015 22:43:30 +0000 (14:43 -0800)]
Virtualization changes to handle X2APIC mode.

These are changes to the vmm to allow it to handle the new MSR based
accesses. This includes allowing the direct msr access in vmx.c,
otherwise vmexiting will occur.

Signed-off-by: GanShun <ganshun@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoEnabling X2APIC
GanShun [Thu, 17 Dec 2015 01:36:39 +0000 (17:36 -0800)]
Enabling X2APIC

Changing all offsets from the old XAPIC mode to the newer X2APIC mode and
removing lapic_wait_to_send. All interaction with the X2APIC is done with
apicrput, apicrget or apicsendipi. Removed memory allocation in pmap64.c
and value check in check_sym_val

Signed-off-by: GanShun <ganshun@gmail.com>
[ removed some debugging comments, fixed pb_ktest ]
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoRemoved lapic_set_id and lapic_set_logid functions
GanShun [Wed, 16 Dec 2015 20:21:09 +0000 (12:21 -0800)]
Removed lapic_set_id and lapic_set_logid functions

These functions are not used and are no longer allowed once we swap to the
X2APIC. Removing them in preparation for activating the X2APIC

Signed-off-by: GanShun <ganshun@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agofp state save, restore, and error handling
Michael Taufen [Mon, 22 Feb 2016 22:55:52 +0000 (14:55 -0800)]
fp state save, restore, and error handling

save_fp_state and restore_fp_state now use xsaveopt64 and xrstor64, and
restore_fp_state handles faults. In the event of a fault,
restore_fp_state prints an error message and then restores
the fp state to a default that was determined at boot.

Signed-off-by: Michael Taufen <mtaufen@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agovm exit handler for xsetbv
Michael Taufen [Mon, 22 Feb 2016 22:47:54 +0000 (14:47 -0800)]
vm exit handler for xsetbv

Signed-off-by: Michael Taufen <mtaufen@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoInitialize guest xcr0, save and restore xcr0 between guest and Akaros
Michael Taufen [Mon, 22 Feb 2016 22:42:31 +0000 (14:42 -0800)]
Initialize guest xcr0, save and restore xcr0 between guest and Akaros

Signed-off-by: Michael Taufen <mtaufen@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoBoot time and per-cpu extended state setup
Michael Taufen [Mon, 22 Feb 2016 22:35:42 +0000 (14:35 -0800)]
Boot time and per-cpu extended state setup

Signed-off-by: Michael Taufen <mtaufen@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoAdd load, safe load, read xcr0 functions
Michael Taufen [Mon, 22 Feb 2016 22:27:02 +0000 (14:27 -0800)]
Add load, safe load, read xcr0 functions

void lxcr0(uint64_t xcr0)
int safe_lxcr0(uint64_t xcr0)
uint64_t rxcr0(void)

Signed-off-by: Michael Taufen <mtaufen@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoRelocated fixup table macros
Michael Taufen [Wed, 24 Feb 2016 23:15:48 +0000 (15:15 -0800)]
Relocated fixup table macros

Signed-off-by: Michael Taufen <mtaufen@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoExtended state data structures (XCC)
Michael Taufen [Mon, 22 Feb 2016 22:11:37 +0000 (14:11 -0800)]
Extended state data structures (XCC)

Rebuild your kernel headers and rebuild all user apps!

new ancillary_state state components
x86_default_xcr0
xcr0 in guest_pcore

Signed-off-by: Michael Taufen <mtaufen@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoRemove some trailing whitespace.
Michael Taufen [Thu, 11 Feb 2016 17:50:47 +0000 (09:50 -0800)]
Remove some trailing whitespace.

Signed-off-by: Michael Taufen <mtaufen@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoTurn off TSD in slave processors.
Kanoj Sarcar [Wed, 24 Feb 2016 19:08:13 +0000 (14:08 -0500)]
Turn off TSD in slave processors.

Turn off TSD (Time Stamp Disable) on slaves.

Signed-off-by: Kanoj Sarcar <kanoj@google.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoAdd page reference counting to mm hooks.
Kanoj Sarcar [Mon, 22 Feb 2016 23:42:20 +0000 (15:42 -0800)]
Add page reference counting to mm hooks.

Add page reference counting logic to some of the user map helper functions.

Expose one of the mlx4 parameters to user space.

Signed-off-by: Kanoj Sarcar <kanoj@google.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoMake query_port not report port_down always.
Kanoj Sarcar [Mon, 22 Feb 2016 23:35:17 +0000 (15:35 -0800)]
Make query_port not report port_down always.

Hack existing linux logic to avoid netdev stuff that was reporting port_down
always.

Signed-off-by: Kanoj Sarcar <kanoj@google.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoFix couple of problems in compat code.
Kanoj Sarcar' via Akaros [Thu, 18 Feb 2016 22:11:00 +0000 (14:11 -0800)]
Fix couple of problems in compat code.

While trying newer tests, the non-initialization logic of SGL's became
apparent. Also, newer tests invoke get_user_pages() without faulting in
corresponding pages, so we need to automatically allocate the pages.

Clean up to do reference counting in get_user_pages() etc will come later.

Signed-off-by: Kanoj Sarcar <kanoj@google.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoRemove user include hacks
Barret Rhoden [Tue, 16 Feb 2016 21:15:34 +0000 (16:15 -0500)]
Remove user include hacks

Due to the old style of having user libraries include their own headers as
both <libname/foo.h> and <foo.h>, we had to have a few hacks to force us to
include the 'real' headers that we wanted.

Now that we do things the right way, we don't need to carry those hacks
around.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoClean up user library include paths (XCC)
Barret Rhoden [Tue, 16 Feb 2016 19:33:42 +0000 (14:33 -0500)]
Clean up user library include paths (XCC)

Allowing libraries to search their own include/ for <foo.h> is a huge mess
that results in issues when glibc has foo.h.  The fix is to not allow that,
and to insist libraries refer to their own files by their full name
(libname/foo.h).

All user libraries (other than pthread) now have their include directories
arranged as:

user/LIBNAME/include/LIBNAME/FOO.h

Their include path is set to user/LIBNAME/include/, and all #includes
explicitly list the libname.

Due to moving parlib's arch symlink, you'll need to do something like:

$ rm user/parlib/include/arch
$ make mrproper
$ mv .config.old .config
$ make ARCH=x86 oldconfig
$ make userclean

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoActivate kernel bypass logic
Kanoj Sarcar' via Akaros [Thu, 11 Feb 2016 01:11:53 +0000 (17:11 -0800)]
Activate kernel bypass logic

Hook in mlx4/ driver to activate kernel bypass logic.

Signed-off-by: Kanoj Sarcar <kanoj@google.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoPort over linux 4.1.15 infiniband/core logic for kernel bypass NIC access
Kanoj Sarcar' via Akaros [Thu, 11 Feb 2016 01:09:47 +0000 (17:09 -0800)]
Port over linux 4.1.15 infiniband/core logic for kernel bypass NIC access

Port over linux 4.1.15 drivers/infiniband/core logic essential for
kernel bypass NIC access. Slight edits to adapt to Akaros environment
(#if exclusion of non essential code blocks, panic stubs etc), described
in README file.

Most of the interlock logic with core kernel (mm/vfs etc) is captured
in compat.[ch].

Signed-off-by: Kanoj Sarcar <kanoj@google.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoPort over linux 4.1.15 mlx4 kernel bypass driver
Kanoj Sarcar' via Akaros [Wed, 10 Feb 2016 23:54:30 +0000 (15:54 -0800)]
Port over linux 4.1.15 mlx4 kernel bypass driver

Port over linux 4.1.15 drivers/infiniband/hw/mlx4 logic essential for
kernel bypass NIC access. Slight edits to adapt to Akaros environment
(#if exclusion of non essential code blocks, panic stubs etc), described
in README file.

Signed-off-by: Kanoj Sarcar <kanoj@google.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoUpdates from vmm-akaros
Michael Taufen [Wed, 10 Feb 2016 17:37:58 +0000 (09:37 -0800)]
Updates from vmm-akaros

Boot params
e820 info
Use copy_vmctl_tovmtf(*) in __build_vm_ctx_cp(*)
Inject GPF on unsupported MSR access
Add linux_bootparam.h

Signed-off-by: Michael Taufen <mtaufen@gmail.com>
[ pragma once, static_assert->parlib_static_assert ]
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoRemove kernel errno string processing
Barret Rhoden [Sat, 13 Feb 2016 20:57:08 +0000 (15:57 -0500)]
Remove kernel errno string processing

The kernel doesn't really need to know about the string names for errno
values.  We were using that mostly as a hack to not use proper
errstrings.

I kept parse_errno.sh around, since we (theoretically) still use that to
generate error lists in glibc.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoRemove uses of errno_to_string()
Barret Rhoden [Sat, 13 Feb 2016 19:03:57 +0000 (14:03 -0500)]
Remove uses of errno_to_string()

Using errno_to_string() was a hack.

In addition to removing that, this commit cleans up a few nasty things.
In namec(), we just had a static string floating around for some reason.
Good times.

More importantly, in sysfile we were doing a brain-dead strcmp on
ENODATA.  Computers should do comparisons on errno.  Errstr is for
humans.  The danger there is that if someone did:

error(ENODATA, "Actually a useful message that was not NULL")

then the strcmp on errstr would fail, since it's not the "string that
meant ENODATA".
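
The failure mode can be reproduced in a few lines.  In this sketch the error
state struct, the setter, and the canned "No data available" string are all
stand-ins chosen for illustration; only the errno-versus-errstr point comes
from the text above.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for per-call error state: a number for code, a
 * string for humans. */
struct sketch_err {
	int err;
	char errstr[128];
};

static inline void sketch_set_error(struct sketch_err *e, int err,
                                    const char *msg)
{
	e->err = err;
	snprintf(e->errstr, sizeof(e->errstr), "%s", msg);
}

/* Fragile: only works if the errstr is exactly the canned string. */
static inline bool sketch_is_enodata_by_string(const struct sketch_err *e)
{
	return strcmp(e->errstr, "No data available") == 0;
}

/* Robust: compare the errno value and leave the string alone. */
static inline bool sketch_is_enodata(const struct sketch_err *e)
{
	return e->err == ENODATA;
}
```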

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoOutlaw the setting of NULL errstrs
Barret Rhoden [Sat, 13 Feb 2016 19:12:01 +0000 (14:12 -0500)]
Outlaw the setting of NULL errstrs

This will catch them if we try to use them.  O/w we'll have to rely on
other methods (code review/tools) to find them.

Maybe there's an argument to be made for a simple error(EFOO, 0),
where you just don't want to bother making a string.  Then for now you
can use ERROR_FIXME.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoRemove the printk format %e
Barret Rhoden [Sat, 13 Feb 2016 19:00:35 +0000 (14:00 -0500)]
Remove the printk format %e

We were using it in two places, and one of them was incorrect (getchar()
wasn't returning an errno).

It was also blindly inferring errstr() from the context.  If we're going to
do that, we might as well not even pass in the err.  It was basically an
unused perror() for the kernel.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoqio: Use an empty string to mark a closed queue
Barret Rhoden [Sat, 13 Feb 2016 17:48:14 +0000 (12:48 -0500)]
qio: Use an empty string to mark a closed queue

Instead of converting ECONNABORTED to a string and then doing a
brain-dead strcmp, we can just use an empty string.  AFAIK, qclose() or
hangups with no message are meant to be normal closes.  Reads will just
return 0 (no data, EOF, etc.), instead of throwing an error.
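
The convention can be sketched as follows.  The struct and function names are
hypothetical and the real qio code has much more state; the return values here
(0 for EOF, -1 for "the caller would throw with q->err") are an assumption made
for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

struct sketch_queue {
	bool closed;
	char err[64];	/* empty string marks a normal close */
};

/* Close or hang up the queue.  A NULL or empty msg means a normal close. */
static inline void sketch_qhangup(struct sketch_queue *q, const char *msg)
{
	q->closed = true;
	snprintf(q->err, sizeof(q->err), "%s", msg ? msg : "");
}

/* Outcome of reading a closed queue: 0 is EOF after a normal close, -1
 * means the close carried a message and the caller should raise q->err. */
static inline int sketch_read_closed(const struct sketch_queue *q)
{
	return q->err[0] == '\0' ? 0 : -1;
}
```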

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoHave #ip protocol's bind()s throw errors
Barret Rhoden [Sat, 13 Feb 2016 17:21:16 +0000 (12:21 -0500)]
Have #ip protocol's bind()s throw errors

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoHave #ip protocol's announce()s throw errors
Barret Rhoden [Sat, 13 Feb 2016 17:16:20 +0000 (12:16 -0500)]
Have #ip protocol's announce()s throw errors

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoHave #ip protocol's connect()s throw errors
Barret Rhoden [Sat, 13 Feb 2016 17:06:23 +0000 (12:06 -0500)]
Have #ip protocol's connect()s throw errors

A couple extra things:

Fsstdbind and Fsstdannounce have temporary waserror shims, until those
get changed.  I needed to change setladdrport to throw, and these needed
to catch it for now.

udpconnect() was the only connect method to call Fsconnected() even if
Fsstdconnect() failed.  All of the others just return immediately.  A
potential effect of Fsconnected() is that it calls rendez_wake on the
"connection rendez" in the conversation (c->cr).  It might be the case
that someone could be sleeping, and we fail to wake them up.  Perhaps a
connect or announce succeeded, but then we failed with this connect, and
now we don't wake anyone.

Given that udpannounce is structured the same as the other connects
(doesn't call Fsconnected() on error), and the qlocks on the CV for
connect and announce, I think this can't happen.  And if it does, then
we just need to fix this connect/announce mess to make it not happen
globally.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoHave #ip's protocol ctl()s throw errors
Barret Rhoden [Sat, 13 Feb 2016 16:29:03 +0000 (11:29 -0500)]
Have #ip's protocol ctl()s throw errors

I took care of UDP and ICMP6 in this commit, since they were fairly
simple.  TCP and IPIFC were a bit more complex.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoThrow errors from within ipifcctl()
Barret Rhoden [Sat, 13 Feb 2016 16:16:34 +0000 (11:16 -0500)]
Throw errors from within ipifcctl()

This commit changes the internals of ipifcctl() to use error().  Note
that this does not change ipifcconnect() (yet), though it mucks with
that function slightly due to a common helper.

I also added a helper for ipifc_iprouting, instead of performing the
"iprouting" operation in-line.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoThrow errors from within tcpctl()
Barret Rhoden [Fri, 12 Feb 2016 22:24:17 +0000 (17:24 -0500)]
Throw errors from within tcpctl()

The #ip ctl message framework expects to get a string for an error, then
it just calls error() with that string.  We even go so far as to catch the
error, then return the current_errstr().  All a bit ridiculous.

Changing TCP's ctl message is the first in a few steps to change #ip's ctl
interface.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoRemove bootp
Barret Rhoden [Fri, 12 Feb 2016 22:48:22 +0000 (17:48 -0500)]
Remove bootp

This was just a source of null pointer faults in the kernel.  bootp was a
function pointer, set to null.  You can crash the kernel with:

/ $ echo "bootp" > /net/ipifc/0/ctl

As Ron aptly put it: "bootp is dead to me."

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoBuild the kernel with -Werror
Barret Rhoden [Sat, 13 Feb 2016 17:31:32 +0000 (12:31 -0500)]
Build the kernel with -Werror

For people doing development, they can turn it off locally (i.e. not
committed) in the Makefile.  But by default, all builds need to have
-Werror.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoRemove the static_link_warning from glibc (XCC)
Barret Rhoden [Fri, 12 Feb 2016 20:02:52 +0000 (15:02 -0500)]
Remove the static_link_warning from glibc (XCC)

If you build a program statically and use dlopen (or one of a few other
functions), you'll get a warning.  The warning is meant to tell you that
even though you built statically, you still need to have the right .so's
for things to work at runtime.  That's fine.

The problem is that, while this information is nice, it is a warning, and
not just informational, and it breaks with -Werror.

Rebuild your toolchain.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agomlx4: Port over Linux header files related to NIC kernel bypass
Kanoj Sarcar' via Akaros [Wed, 10 Feb 2016 23:31:00 +0000 (15:31 -0800)]
mlx4: Port over Linux header files related to NIC kernel bypass

Slight edits on top of Linux 4.1.15 header files for Akaros builds.

Signed-off-by: Kanoj Sarcar <kanoj@google.com>
[fixed a checkpatch warning]
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoEnable build info rebuild upon HEAD commit ID change
Davide Libenzi [Mon, 21 Dec 2015 19:24:36 +0000 (11:24 -0800)]
Enable build info rebuild upon HEAD commit ID change

Enable build info rebuild upon HEAD commit ID change.

Signed-off-by: Davide Libenzi <dlibenzi@google.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoAdd QEMU port forwarding options for UDP as well as TCP
Edward Hyunkoo Jee [Thu, 11 Feb 2016 01:26:13 +0000 (17:26 -0800)]
Add QEMU port forwarding options for UDP as well as TCP

Signed-off-by: Edward Hyunkoo Jee <edjee@google.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoRemove #nix
Barret Rhoden [Wed, 10 Feb 2016 23:22:44 +0000 (18:22 -0500)]
Remove #nix

That was an old experiment that we haven't touched in over a year and will
probably never use again.  If we do, we can resurrect it.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoVMM: Fix missed posted IRQs
Barret Rhoden [Wed, 10 Feb 2016 19:22:52 +0000 (14:22 -0500)]
VMM: Fix missed posted IRQs

There's a couple parts to it:

- vmrunkernel was not posting the IRQ properly; it wasn't setting the
  outstanding notification bit.

- We need to self_ipi when that bit is set.  We had previously lost a race
  when poking the guest pcore (IPI was sent, but not received while in a
  VM).  We just resend the IPI.

For ease of access, I now store the posted_irq_desc in the GPC.
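
As a rough illustration of the first point, posting an IRQ means setting the
vector's bit in the descriptor's 256-bit request bitmap and then setting the
outstanding notification bit (descriptor bit 256 in the VMX layout).  The
sketch below uses C11 atomics and hypothetical names; it is not the
vmrunkernel or kernel code.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* 64-byte posted-interrupt descriptor, as consumed by VMX hardware. */
struct sketch_posted_irq_desc {
	_Alignas(64) _Atomic uint64_t pir[4];	/* descriptor bits 255:0 */
	_Atomic uint64_t control;		/* bit 0 here is bit 256: ON */
	uint64_t rsvd[3];
};

/* Post a vector: request bit first, then the outstanding notification bit. */
static inline void sketch_post_irq(struct sketch_posted_irq_desc *d,
                                   uint8_t vector)
{
	atomic_fetch_or(&d->pir[vector / 64], UINT64_C(1) << (vector % 64));
	atomic_fetch_or(&d->control, UINT64_C(1));
}

/* If this is still set, the notification was missed and the IPI should be
 * sent again. */
static inline bool sketch_notif_outstanding(struct sketch_posted_irq_desc *d)
{
	return atomic_load(&d->control) & UINT64_C(1);
}
```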

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoVMM: Add a syscall to poke a guest pcore (XCC)
Barret Rhoden [Mon, 8 Feb 2016 17:29:27 +0000 (12:29 -0500)]
VMM: Add a syscall to poke a guest pcore (XCC)

Posted IRQs in VMX are a lot like poking the guest pcore, so we'll just use
a syscall for it.

There's a bit of nastiness with error handling.  So far, it's a real pain
to find out if a posted IRQ landed on the VM and handling if it didn't.
(When the POKE IRQ lands and the core wasn't a VM, how do we know for
certain which VM we were supposed to interrupt, without doing something
painful?).

The general Akaros philosophy here is to post a bit in memory and poke
spuriously.  When it comes to notifying vcores, we set notif_pending, send
a (possibly spurious) __notify, and if we missed it, we'll see the
notif_pending the next time we __startcore.  Hopefully we can do something
similar with posted IRQs.
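
The "post a bit, then poke spuriously" pattern can be sketched like this.  The
struct, the counter standing in for the actual __notify, and the function
names are assumptions made for the example; only the ordering (flag first,
poke second, re-check on start) comes from the description above.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct sketch_vcore {
	atomic_bool notif_pending;	/* the bit posted in memory */
	int pokes_sent;			/* stand-in for sending __notify */
};

/* Sender: post the bit before poking, so a missed poke is recoverable. */
static inline void sketch_notify(struct sketch_vcore *vc)
{
	atomic_store(&vc->notif_pending, true);
	vc->pokes_sent++;	/* may turn out to be spurious; that is fine */
}

/* Receiver: on every start, consume the bit and report whether there was
 * a notification, regardless of whether the poke ever arrived. */
static inline bool sketch_startcore_check(struct sketch_vcore *vc)
{
	return atomic_exchange(&vc->notif_pending, false);
}
```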

This also cleans up all of the vmctl hacks, none of which are needed
anymore.

Reinstall your kernel headers.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoRename SYS_setup_vmm -> SYS_vmm_setup (XCC)
Barret Rhoden [Fri, 5 Feb 2016 21:39:26 +0000 (16:39 -0500)]
Rename SYS_setup_vmm -> SYS_vmm_setup (XCC)

This will make having multiple VMM syscalls slightly cleaner.

Reinstall your kernel headers.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoVMM: Use the I_POKE_CORE IRQ for posted IRQs
Barret Rhoden [Fri, 5 Feb 2016 19:19:48 +0000 (14:19 -0500)]
VMM: Use the I_POKE_CORE IRQ for posted IRQs

Posting IRQs is a lot like poking a core, especially when you try and think
about it in an architecture independent manner.  I_POKE_CORE is normally a
way to make sure a core isn't halted.  For VMs, it's now also a way to poke
the VMX hardware to inject an IRQ if necessary.

Note that if we want to post an IRQ and send an I_POKE_CORE, but the VM
exits before the IRQ gets there, we just run the regular POKE_ handler,
which does nothing.

This also makes our life a little easier, in that handle_vmexit_ext_irq()
doesn't need to think about getting an I_POKE.  Pokes aren't full-up IRQs
(requiring a registered handler) regardless of whether they are sent to a
core running a VM or a 'regular' core.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoVMM: Remove unused tests
Barret Rhoden [Fri, 5 Feb 2016 21:38:54 +0000 (16:38 -0500)]
VMM: Remove unused tests

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agomlx4: Enable QP destruction
Kanoj Sarcar' via Akaros [Wed, 10 Feb 2016 23:18:17 +0000 (15:18 -0800)]
mlx4: Enable QP destruction

QP destruction panics because radix_tree_delete() is panic-stubbed. Implement
a version using linked lists that allows deletion.
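
As an illustration only, here is a small keyed linked list with insert,
lookup, and delete.  It is a sketch of the general idea, not the code in
this commit; the id_list names are hypothetical, and lookups are O(n)
where a radix tree would be faster.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a radix tree: a singly linked list of
 * (key, item) pairs. */
struct id_node {
	unsigned long key;
	void *item;
	struct id_node *next;
};

struct id_list {
	struct id_node *head;
};

/* Returns 0 on success, -1 if allocation fails. */
static int id_list_insert(struct id_list *l, unsigned long key, void *item)
{
	struct id_node *n = malloc(sizeof(*n));

	if (!n)
		return -1;
	n->key = key;
	n->item = item;
	n->next = l->head;
	l->head = n;
	return 0;
}

/* Returns the item stored under key, or NULL if there is none. */
static void *id_list_lookup(struct id_list *l, unsigned long key)
{
	for (struct id_node *n = l->head; n; n = n->next)
		if (n->key == key)
			return n->item;
	return NULL;
}

/* Unlinks and frees the node for key.  Returns the item it held, or NULL
 * if the key was not present. */
static void *id_list_delete(struct id_list *l, unsigned long key)
{
	for (struct id_node **pp = &l->head; *pp; pp = &(*pp)->next) {
		if ((*pp)->key == key) {
			struct id_node *n = *pp;
			void *item = n->item;

			*pp = n->next;
			free(n);
			return item;
		}
	}
	return NULL;
}
```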

Signed-off-by: Kanoj Sarcar <kanoj@google.com>
[ tagged commit with mlx4: ]
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoAllow CQ destruction in core mlx4/ logic.
Kanoj Sarcar' via Akaros [Wed, 10 Feb 2016 23:10:21 +0000 (15:10 -0800)]
Allow CQ destruction in core mlx4/ logic.

Enable CQ destruction.

Signed-off-by: Kanoj Sarcar <kanoj@google.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoACPI changes for DMAR and new directory hierarchy.
Dan Cross [Fri, 5 Feb 2016 20:36:13 +0000 (15:36 -0500)]
ACPI changes for DMAR and new directory hierarchy.

Add the DMAR parser, and rationalize the ACPI directory
hierarchy to make it traversable from the shell. There
is additional work to do here on the latter, but that is
not critical path and this gets Gan the DMAR code he
needs for virtualization.

Signed-off-by: Dan Cross <crossd@gmail.com>
[ Fixed 16 -> KMALLOC_ALIGNMENT ]
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoSet up root home for ssh
Ronald G. Minnich [Tue, 9 Feb 2016 20:59:25 +0000 (12:59 -0800)]
Set up root home for ssh

Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoBind #random to /dev during ifconfig
Barret Rhoden [Fri, 5 Feb 2016 15:29:34 +0000 (10:29 -0500)]
Bind #random to /dev during ifconfig

Most programs expect to find random/urandom there.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoUse the new RNG for the networking stack
Barret Rhoden [Thu, 4 Feb 2016 23:35:41 +0000 (18:35 -0500)]
Use the new RNG for the networking stack

And remove the old version.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoUse random_read() for small urandom_read() calls
Barret Rhoden [Thu, 4 Feb 2016 23:33:17 +0000 (18:33 -0500)]
Use random_read() for small urandom_read() calls

urandom_read() starts with a random 8 byte seed.  If we need less than
that, we can (probably) just go with those bytes, as if we called
random_read() directly.

This also exposes random_read() and urandom_read() to external callers.
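
A rough sketch of that shortcut.  Everything here is an assumption made
for illustration: the *_sketch names are hypothetical, the "entropy"
source is a deterministic counter, and xorshift64 is a placeholder for
whatever stretching the real urandom_read() does.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SEED_LEN 8	/* urandom starts from an 8 byte random seed */

static unsigned int fake_entropy_reads;	/* counts calls, for illustration */

/* Placeholder for the strong source.  NOT random: it hands out
 * 1, 2, 3, ... so the example is deterministic. */
static void random_read_sketch(void *buf, size_t n)
{
	static uint8_t next = 1;
	uint8_t *p = buf;

	for (size_t i = 0; i < n; i++)
		p[i] = next++;
	fake_entropy_reads++;
}

static void urandom_read_sketch(void *buf, size_t n)
{
	uint8_t *p = buf;
	uint64_t state;

	/* Small request: the seed bytes alone would cover it, so just
	 * read them directly and skip the generator. */
	if (n < SEED_LEN) {
		random_read_sketch(buf, n);
		return;
	}
	/* Large request: seed once, then stretch with a PRNG. */
	random_read_sketch(&state, sizeof(state));
	if (state == 0)
		state = 1;	/* xorshift must not start at zero */
	while (n > 0) {
		size_t chunk = n < sizeof(state) ? n : sizeof(state);

		state ^= state << 13;
		state ^= state >> 7;
		state ^= state << 17;
		memcpy(p, &state, chunk);
		p += chunk;
		n -= chunk;
	}
}
```

Either path touches the strong source exactly once per call; the small
path simply avoids the extra generator work.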

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoRemove major stack consumers from procread()
Barret Rhoden [Thu, 4 Feb 2016 22:36:57 +0000 (17:36 -0500)]
Remove major stack consumers from procread()

procread() is a disaster.  This makes it less of a disaster.  We were
running off the end of the kernel stack when running with 4K stacks, due to
the 1K or so procread needed.  We're already deep in 9ns, and that was
enough to clobber memory.

I also removed all of the functionality that was #if 0'd out.  If we ever
need it, we can recreate it or get it from Plan 9 / Harvey.
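
One common way to shrink a function's stack footprint is to move large
scratch buffers to the heap.  The sketch below shows that idea in
userspace C; it is an assumption about the general technique, not a copy
of what this commit does to procread(), and format_status() is a
hypothetical helper.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define STATUS_BUF_SZ 1024

/* Formats "<pid> <state>\n" into out (at most outlen bytes, always NUL
 * terminated when outlen > 0).  Returns the number of bytes copied, not
 * counting the NUL, or -1 if the scratch buffer cannot be allocated. */
static int format_status(char *out, size_t outlen, int pid, const char *state)
{
	char *buf;	/* was: char buf[STATUS_BUF_SZ]; about 1K of stack */
	int len;

	if (outlen == 0)
		return 0;
	buf = malloc(STATUS_BUF_SZ);
	if (!buf)
		return -1;
	len = snprintf(buf, STATUS_BUF_SZ, "%d %s\n", pid, state);
	if (len < 0)
		len = 0;
	if (len > STATUS_BUF_SZ - 1)
		len = STATUS_BUF_SZ - 1;
	if ((size_t)len > outlen - 1)
		len = (int)(outlen - 1);
	memcpy(out, buf, (size_t)len);
	out[len] = '\0';
	free(buf);
	return len;
}
```

The cost is an allocation and a new failure path, which is usually a fair
trade when the alternative is overrunning a 4K stack.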

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoFix strace flow control and data extraction issues
Barret Rhoden [Thu, 4 Feb 2016 18:34:11 +0000 (13:34 -0500)]
Fix strace flow control and data extraction issues

There were a few issues.

- Using qwrite() was causing us to block at a bad time.  Using qiwrite()
  won't block, but it also won't do any flow control.  Instead, we do our
  own flow control.  Note that we only check on the trace before the
  syscall.  If we have a start entry, we should have an exit entry,
  regardless of the queue size.

- When we procread(), we were grabbing a reference on the proc.  However,
  if the process already exited, then we'd fail.  This prevented us from
  draining the queue after the hangup.

- We drop traces.  Now we report the numbers.  If your console is slow, try
  redirecting to a file (e.g. strace foo 2> trace_file).
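
The flow-control rule in the first bullet can be sketched as follows.
The types and names are hypothetical, and the queue is reduced to a byte
count; the point is only that the fullness check happens at syscall
entry, and that an exit record is always written when its entry record
was.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified trace queue: just a byte count and a limit. */
struct trace_queue {
	size_t len;
	size_t limit;
	unsigned long drops;	/* number of syscalls we did not trace */
};

/* Per-syscall state remembering whether the entry record was written. */
struct traced_call {
	bool traced;
};

/* Syscall entry: the only place we check for a full queue.  A dropped
 * trace is counted so it can be reported later. */
static bool trace_enter(struct trace_queue *q, struct traced_call *c,
                        size_t rec_len)
{
	if (q->len >= q->limit) {
		q->drops++;
		c->traced = false;
		return false;
	}
	q->len += rec_len;
	c->traced = true;
	return true;
}

/* Syscall exit: if we wrote an entry record, always write the exit
 * record, even if that pushes the queue past its limit. */
static void trace_exit(struct trace_queue *q, struct traced_call *c,
                       size_t rec_len)
{
	if (!c->traced)
		return;
	q->len += rec_len;
}
```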

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoChange qfull() to check limits
Barret Rhoden [Thu, 4 Feb 2016 17:35:23 +0000 (12:35 -0500)]
Change qfull() to check limits

Instead of checking Qflow flags, we just check the len vs the limit.  This
way, we can use qfull() when we're filling a queue with qiwrite/qibwrite
(which don't bother setting Qflow).
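
Roughly, the difference between the two checks looks like this.  The
struct is a hypothetical, cut-down queue and QFLOW_SKETCH is a made-up
flag value; only the shape of the comparison is taken from the text
above.

```c
#include <stdbool.h>

#define QFLOW_SKETCH 0x1	/* hypothetical "producer was flow controlled" flag */

/* Hypothetical, cut-down queue with only the fields the check needs. */
struct queue_sketch {
	int len;	/* bytes currently queued */
	int limit;	/* flow-control threshold */
	int state;	/* flag bits */
};

/* Old style: trusts a flag that only the blocking writers maintain. */
static bool qfull_by_flag(const struct queue_sketch *q)
{
	return (q->state & QFLOW_SKETCH) != 0;
}

/* New style: derived from the data itself, so it is correct no matter
 * which write path filled the queue. */
static bool qfull_by_len(const struct queue_sketch *q)
{
	return q->len >= q->limit;
}
```

A queue filled by a non-blocking writer shows the gap: the flag is never
set, so the flag-based check reports "not full" even at the limit.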

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoFix page faults in strace
Barret Rhoden [Thu, 4 Feb 2016 03:00:24 +0000 (22:00 -0500)]
Fix page faults in strace

There are a couple of issues, one of which we actually hit.  For one, if the tracer
turned on halfway through a syscall, then kthread->strace might have
been garbage.  Probably not, but better safe than sorry.

The other issue is that the syscall struct could be gone by the time the
syscall ends and we finish the trace.  This could happen with
sys_exec().  We clear kthread->sysc, since the memory it points to was
already freed during the exec.

sctrace() will get cleaned up a little once we merge it with the
original tracer.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoFix warnings due to the declaration of serialize_*
Barret Rhoden [Thu, 4 Feb 2016 02:46:52 +0000 (21:46 -0500)]
Fix warnings due to the declaration of serialize_*

We missed these in commit 1fe2ed47726f ("Fix parameter types for
sys_proc_create() (XCC)").

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoAdd a random device; remove old genrandom junk; remove random from #cons
Ronald G. Minnich [Thu, 4 Feb 2016 03:09:02 +0000 (19:09 -0800)]
Add a random device; remove old genrandom junk; remove random from #cons

Compiles just fine.

Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoGet the basic random number generator functions to compile
Ronald G. Minnich [Thu, 4 Feb 2016 02:48:37 +0000 (18:48 -0800)]
Get the basic random number generator functions to compile

Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoFormat with .clang-format, included herein for reference
Ronald G. Minnich [Thu, 4 Feb 2016 02:22:18 +0000 (18:22 -0800)]
Format with .clang-format, included herein for reference

BasedOnStyle: LLVM
IndentWidth: 4
TabWidth: 4
UseTab: Always
BreakBeforeBraces: Linux
AllowShortIfStatementsOnASingleLine: false
IndentCaseLabels: false
AlignAfterOpenBracket: true

Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoImport new random number generator files from harvey
Ronald G. Minnich [Thu, 4 Feb 2016 01:52:35 +0000 (17:52 -0800)]
Import new random number generator files from harvey

This is the unchanged version; they will not compile at all
and need formatting.

Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoAdd a .clang-format
Ronald G. Minnich [Thu, 4 Feb 2016 01:53:52 +0000 (17:53 -0800)]
Add a .clang-format

This essentially follows the linux format save for tab indent of 4.

I've used this to format big blocks of code and it removes almost all
warnings from checkpatch, and the only ones left were
known to be bogus.

Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoNew and easy strace framework.
Ronald G. Minnich [Mon, 25 Jan 2016 22:26:22 +0000 (14:26 -0800)]
New and easy strace framework.

echo straceon > /proc/pid/strace
cat /proc/pid/strace

echo straceme if you don't want inheritance.
echo straceoff > /proc/pid/strace to stop it.

That's it. strace acts like a file, and when you read it you'll see
syscall info (enter and exit) for the process.

So strace is now spelled cat, dd, grep, or, well,
anything that reads files.

Inheritance is working.

This is a very efficient way to trace processes, even better than
the tracer I wrote for Plan 9: a single read from strace can return
many system call records.
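
To illustrate the "many records per read" point, here is a small reader
sketch.  It assumes, purely for the example, that each record ends with
a newline; the commit does not state the record format, and
count_records_in_one_read() is a hypothetical helper.

```c
#include <stddef.h>
#include <unistd.h>

/* Issues exactly one read() on fd and counts how many newline-terminated
 * records came back.  Returns -1 if the read fails. */
static int count_records_in_one_read(int fd)
{
	char buf[4096];
	ssize_t n = read(fd, buf, sizeof(buf));
	int records = 0;

	if (n < 0)
		return -1;
	for (ssize_t i = 0; i < n; i++)
		if (buf[i] == '\n')
			records++;
	return records;
}
```

The fd could come from open() on the strace file; any readable fd works,
which is the point of making strace look like a file.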

This now dumps read, write, and open information.

Included is a sample strace program which works. You can even
strace a shell now and watch the children.

Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
[ various fixups, side by side with Ron! ]
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoFix parameter types for sys_proc_create() (XCC)
Barret Rhoden [Wed, 3 Feb 2016 23:29:33 +0000 (18:29 -0500)]
Fix parameter types for sys_proc_create() (XCC)

There were too many consts in there.  The new ones are exactly from
execv()'s parameters.

Technically, you should rebuild glibc.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoReplace old ether82563.c with new driver.
Dan Cross [Mon, 25 Jan 2016 23:44:14 +0000 (18:44 -0500)]
Replace old ether82563.c with new driver.

Replace the old ether82563.c driver with the new driver
taken from Plan 9, incorporating Geoff Collyer's recent work
on the i218.

Signed-off-by: Dan Cross <crossd@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoLatest ether82563.c: integrate into Akaros.
Dan Cross [Mon, 25 Jan 2016 19:46:25 +0000 (14:46 -0500)]
Latest ether82563.c: integrate into Akaros.

Get the latest ether82563 driver building and running on Akaros;
integrate Akaros-specific changes into the driver. Document further
work.

Signed-off-by: Dan Cross <crossd@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agolindent on ether82563.c
Dan Cross [Fri, 22 Jan 2016 22:12:11 +0000 (17:12 -0500)]
lindent on ether82563.c

Run 'lindent' on ether82563.c.

Signed-off-by: Dan Cross <crossd@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agospatch plan9-ether82563.c
Dan Cross [Fri, 22 Jan 2016 22:11:31 +0000 (17:11 -0500)]
spatch plan9-ether82563.c

Run 'spatch' on ether82563.c from Plan 9.

Signed-off-by: Dan Cross <crossd@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoImport latest ether82563.c from Plan 9
Dan Cross [Fri, 22 Jan 2016 22:05:59 +0000 (17:05 -0500)]
Import latest ether82563.c from Plan 9

Bring in the latest copy of the Intel ether82563 driver
from Plan 9. This is the unmodified driver, save for
ensuring the correct copyright statement is at the top
of the file.

Signed-off-by: Dan Cross <crossd@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoVMM: Rename vmx_vcpu -> guest_pcore (XCC)
Barret Rhoden [Tue, 2 Feb 2016 17:40:09 +0000 (12:40 -0500)]
VMM: Rename vmx_vcpu -> guest_pcore (XCC)

"Virtual CPU" is a little too close to "Virtual Core", which is something
completely different.  "Guest pcore" seems a little clearer, given Akaros's
current naming conventions.  Let me know if you have a better name.

This also moves the vcpu / gpc from the kernel header, since userspace
doesn't need to know about it.

Reinstall your kernel headers.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoVMM: Clean up VMX setup
Barret Rhoden [Tue, 2 Feb 2016 17:25:02 +0000 (12:25 -0500)]
VMM: Clean up VMX setup

The pcpui->vmxarea was only used to pass the vmx buffer from setup() to
enable().

We can just merge those two functions and put their guts into
intel_vmm_pcpu_init().

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoVMM: Remove unused code (XCC)
Barret Rhoden [Tue, 2 Feb 2016 17:20:57 +0000 (12:20 -0500)]
VMM: Remove unused code (XCC)

This removes a lot of the KVM "bag on the side".

For whatever reason, struct vmx_vcpu is in a kernel header, so reinstall
your headers.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoVMM: Use VM contexts [2/2]
Barret Rhoden [Tue, 2 Feb 2016 17:02:24 +0000 (12:02 -0500)]
VMM: Use VM contexts [2/2]

The bulk of this commit changes vmrunkernel to use VM contexts, but without
changing the overall structure of the program.  We now use two uthreads:
one for the VM and one for the controller (i.e. int main()).  They pass
control back and forth with a mutex (the ball).
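
A userspace sketch of two threads passing a "ball" back and forth.  This
uses pthreads rather than Akaros uthreads, and it adds a condition
variable and an owner field so the handoff order is deterministic; the
commit only mentions a mutex, so treat those extras, and all of the
names, as assumptions.

```c
#include <pthread.h>

enum turn { CONTROLLER, GUEST };

/* The "ball": whoever owner names is allowed to run. */
struct ball {
	pthread_mutex_t mtx;
	pthread_cond_t cv;
	enum turn owner;
};

static struct ball the_ball = {
	PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, CONTROLLER
};
static int order[8];	/* records who ran, in sequence */
static int order_len;

/* Block until it is 'me's turn. */
static void ball_wait(struct ball *b, enum turn me)
{
	pthread_mutex_lock(&b->mtx);
	while (b->owner != me)
		pthread_cond_wait(&b->cv, &b->mtx);
	pthread_mutex_unlock(&b->mtx);
}

/* Hand the ball to 'to' and wake it up. */
static void ball_pass(struct ball *b, enum turn to)
{
	pthread_mutex_lock(&b->mtx);
	b->owner = to;
	pthread_cond_broadcast(&b->cv);
	pthread_mutex_unlock(&b->mtx);
}

/* Stand-in for the VM thread: "run the guest", then hand back. */
static void *guest_thread(void *arg)
{
	(void)arg;
	for (int i = 0; i < 2; i++) {
		ball_wait(&the_ball, GUEST);
		order[order_len++] = GUEST;
		ball_pass(&the_ball, CONTROLLER);
	}
	return NULL;
}

/* Stand-in for main(): do controller work, then let the guest run.
 * Returns the number of turns taken, or -1 on thread creation failure. */
static int run_handoff(void)
{
	pthread_t t;

	if (pthread_create(&t, NULL, guest_thread, NULL))
		return -1;
	for (int i = 0; i < 2; i++) {
		ball_wait(&the_ball, CONTROLLER);
		order[order_len++] = CONTROLLER;
		ball_pass(&the_ball, GUEST);
	}
	pthread_join(t, NULL);
	return order_len;
}
```

Only the thread holding the ball touches order[], and every handoff goes
through the mutex, so the turns strictly alternate.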

The other changes are to actually use the vmexit_handler in the kernel
(HOST_RIP) and to just assume we are notifying GPC 0 (which is what we've
been doing).  IPI injection needs a little work.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoVMM: Add kernel support for VM contexts [1/2]
Barret Rhoden [Tue, 2 Feb 2016 16:58:24 +0000 (11:58 -0500)]
VMM: Add kernel support for VM contexts [1/2]

The kernel now knows how to pop VM contexts and handle VM exits.

As of this commit, we're still using the old KVM loop.  The HOST_RIP on
resume is still set to use the old KVM loop, IPI injection still uses the
vmctl, and userspace does not ask it to use contexts.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agox86: Factor out irq_dispatch() from handle_irq()
Barret Rhoden [Tue, 2 Feb 2016 16:54:59 +0000 (11:54 -0500)]
x86: Factor out irq_dispatch() from handle_irq()

I'll need to call irq_dispatch() when VMs exit due to external interrupts.
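
The shape of that refactoring, as a sketch: the table lookup and handler
call move into one helper that both entry paths share.  The *_sketch
names, the table layout, and the return values are hypothetical and do
not reflect the real Akaros signatures.

```c
#define NR_VECTORS_SKETCH 256

typedef void (*irq_handler_fn)(int vector, void *data);

struct irq_slot {
	irq_handler_fn fn;
	void *data;
};

static struct irq_slot irq_table[NR_VECTORS_SKETCH];

/* Example handler: counts how many times it ran. */
static void count_irq(int vector, void *data)
{
	(void)vector;
	(*(int *)data)++;
}

/* Shared helper: look up the vector and run its handler.
 * Returns 0 if a handler ran, -1 if the vector is invalid or unhandled. */
static int irq_dispatch_sketch(int vector)
{
	if (vector < 0 || vector >= NR_VECTORS_SKETCH || !irq_table[vector].fn)
		return -1;
	irq_table[vector].fn(vector, irq_table[vector].data);
	return 0;
}

/* Path 1: a hardware interrupt taken while running host code.  Any
 * entry/exit bookkeeping would wrap the dispatch call. */
static int handle_irq_sketch(int vector)
{
	return irq_dispatch_sketch(vector);
}

/* Path 2: the same interrupt, discovered because it forced a VM exit. */
static int handle_vmexit_ext_irq_sketch(int vector)
{
	return irq_dispatch_sketch(vector);
}
```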

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>