akaros.git
2 years agovmap: Add a helper for global TLB shootdowns
Barret Rhoden [Sun, 27 Nov 2016 15:09:47 +0000 (10:09 -0500)]
vmap: Add a helper for global TLB shootdowns

The global TLB flush is an x86 thing, though other architectures might
have one too.  On x86, PTE_G means that the TLB entry won't be flushed
on a normal cr3 reload.  We use these for kernel mappings.  You have to
go through a couple extra hoops to flush those entries.

For us to dynamically change kernel virtual mappings, we'll need to
occasionally flush global TLB entries.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agovmap: Handle unaligned vaddrs on vunmap_vmem()
Barret Rhoden [Sat, 26 Nov 2016 22:03:20 +0000 (17:03 -0500)]
vmap: Handle unaligned vaddrs on vunmap_vmem()

vmap_pmem() allows the user to give us an unaligned paddr and it maps
enough pages to cover the requested region.  However, for the unmap, we
incorrectly thought we were given the vaddr of the overall mapping - not
the vaddr we returned to the caller (which was vaddr + PGOFF(paddr)).
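
A minimal sketch of the arithmetic involved, assuming 4 KiB pages.  The
helpers unmap_base() and unmap_nr_pages() are hypothetical names for
illustration, not the Akaros functions, and PGOFF here is a local stand-in
for the kernel macro.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PGSIZE 4096UL	/* assumed page size */
#define PGOFF(a) ((uintptr_t)(a) & (PGSIZE - 1))
#define ROUNDDOWN(a, n) ((uintptr_t)(a) & ~((uintptr_t)(n) - 1))
#define ROUNDUP(a, n) ROUNDDOWN((uintptr_t)(a) + (n) - 1, (n))

/* Hypothetical: page-aligned start of the mapping that contains vaddr.
 * Equivalent to vaddr - PGOFF(vaddr). */
uintptr_t unmap_base(uintptr_t vaddr)
{
	return ROUNDDOWN(vaddr, PGSIZE);
}

/* Hypothetical: number of pages needed to cover [vaddr, vaddr + len). */
size_t unmap_nr_pages(uintptr_t vaddr, size_t len)
{
	assert(len > 0);
	return (ROUNDUP(vaddr + len, PGSIZE) - unmap_base(vaddr)) / PGSIZE;
}
```

The point is that the unmap side has to undo the PGOFF() offset itself: a
32 byte region starting at 0x1ff0 straddles a page boundary, so it covers
two pages starting at 0x1000.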

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoAllocate natural alignment with get_cont_pages()
Barret Rhoden [Wed, 23 Nov 2016 17:08:41 +0000 (12:08 -0500)]
Allocate natural alignment with get_cont_pages()

Linux code, notoriously the bnx2x driver, occasionally needs naturally
aligned contiguous page allocations.  Since the only code using
get_cont_pages() is Linux code, we can just use xalloc and get the
alignment they want.  Note that xalloc() is less efficient than the regular
allocations, due mostly to bypassing the arena's qcaches.
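
As a rough illustration of what "naturally aligned" means here: a block of
2^order pages starts at an address that is a multiple of its own size.  The
sketch below assumes 4 KiB pages, and fit_natural() is a hypothetical
helper, not part of the arena code.

```c
#include <stddef.h>
#include <stdint.h>

#define PGSIZE 4096UL	/* assumed page size */

/* Size in bytes of a 2^order page allocation. */
size_t order_to_bytes(unsigned int order)
{
	return PGSIZE << order;
}

/* Naturally aligned: the address is a multiple of the allocation's size. */
int is_naturally_aligned(uintptr_t addr, unsigned int order)
{
	return (addr & (order_to_bytes(order) - 1)) == 0;
}

/* Hypothetical helper: first naturally aligned address inside the free
 * segment [start, start + len) that can hold 2^order pages, or 0 if the
 * segment is too small once alignment is accounted for. */
uintptr_t fit_natural(uintptr_t start, size_t len, unsigned int order)
{
	size_t size = order_to_bytes(order);
	uintptr_t aligned = (start + size - 1) & ~(uintptr_t)(size - 1);

	if (aligned < start || aligned - start > len)
		return 0;
	if (len - (aligned - start) < size)
		return 0;
	return aligned;
}
```

A two-page request in a segment starting at 0x3000 has to skip ahead to
0x4000, which is why an aligned allocation can fail even when the segment
has enough total bytes.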

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoConvert calls of get_cont_pages() to kpages_alloc
Barret Rhoden [Wed, 23 Nov 2016 16:58:18 +0000 (11:58 -0500)]
Convert calls of get_cont_pages() to kpages_alloc

get_cont_pages() and its 'order' interface are a mess.  Just ask for how
much memory you want, and let the allocator figure it out.

Also, get_cont_pages_node() was just pretending to do something
NUMA-related.  Remove it for now, and we can 'do the right thing' when we
figure it all out.

Linux code can still use get_cont_pages().  The semantics of that will
change shortly.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoslab: Update the ctor/dtor interface
Barret Rhoden [Tue, 22 Nov 2016 21:44:17 +0000 (16:44 -0500)]
slab: Update the ctor/dtor interface

The priv (private) field is to support parameterized callbacks.  For
instance, you might have separate kmem_caches for different parts of a
driver.

The old 'size' field was useless, since the caller should know the size of
the object (if that's even useful).

ctor can fail, and it will respect the mem flags.  I have a couple ctors in
mind that could block, so they'll need to check MEM_WAIT/MEM_ATOMIC.

I moved the dtor out of free_to_slab since the ctor needs to call free
if it failed.  I also considered adding a batch dtor interface so we can
free a chunk of objects at once, which could amortize the overhead of
freeing.  For example, if there was an expensive operation that had to
be done after freeing any object (such as a TLB shootdown), then a batch
dtor would make sense.  It turns out that I don't need this for now, so
I opted to keep the vmem paper's API.
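
A minimal sketch of how a ctor that can fail might interact with the
allocation path.  Everything here is hypothetical: the struct layout, the
flag values, and the use of malloc() as the backing store are stand-ins for
illustration, not the Akaros slab code.

```c
#include <stdlib.h>

#define MEM_ATOMIC 0x1	/* hypothetical flag values */
#define MEM_WAIT   0x2

/* Hypothetical cache: only the fields the sketch needs. */
struct cache {
	size_t obj_size;
	int (*ctor)(void *obj, void *priv, int flags);	/* 0 on success */
	void (*dtor)(void *obj, void *priv);
	void *priv;
};

/* Allocate raw memory, then construct.  If the ctor fails, give the raw
 * memory back without running the dtor, since the object was never built. */
void *cache_alloc(struct cache *kc, int flags)
{
	void *obj = malloc(kc->obj_size);

	if (!obj)
		return NULL;
	if (kc->ctor && kc->ctor(obj, kc->priv, flags)) {
		free(obj);
		return NULL;
	}
	return obj;
}

/* Destroy first, then free the raw memory. */
void cache_free(struct cache *kc, void *obj)
{
	if (kc->dtor)
		kc->dtor(obj, kc->priv);
	free(obj);
}

/* Example ctor: pretends it would have to block, so it refuses to run
 * under MEM_ATOMIC.  priv points at a per-cache counter. */
int counting_ctor(void *obj, void *priv, int flags)
{
	if (flags & MEM_ATOMIC)
		return -1;
	*(int *)obj = 0;
	(*(int *)priv)++;
	return 0;
}
```

Keeping the dtor call separate from the raw free is what lets the failure
path return memory that was never constructed.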

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoAdd #mem, for memory diagnostics
Barret Rhoden [Tue, 22 Nov 2016 20:57:35 +0000 (15:57 -0500)]
Add #mem, for memory diagnostics

ls \#mem for details.  Most people would be interested in #mem/free and
 #mem/kmemstat.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoRemove old memory tests
Barret Rhoden [Tue, 22 Nov 2016 20:41:09 +0000 (15:41 -0500)]
Remove old memory tests

They are unused and just print a bunch of stuff that no one looks at.
Also, I want to remove the printing functions in favor of better debugging
diagnostics.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoTracks arenas and slabs on tailqs
Barret Rhoden [Wed, 9 Nov 2016 16:12:52 +0000 (11:12 -0500)]
Tracks arenas and slabs on tailqs

Both lists are protected by the arenas_and_slabs qlock.  The all_arenas
list will be useful for #mem.  The all_kmem_caches might not be useful,
since all caches always have a source arena.  It's fine for diagnostics
for now.

The important thing is that the existence and importing links are all
managed by the same lock, so things like reclaim and #mem device ops can
happen without worrying about the read-mostly structure changing.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoarena: Connecting importers with sources
Barret Rhoden [Wed, 9 Nov 2016 15:44:44 +0000 (10:44 -0500)]
arena: Connecting importers with sources

This allows us to see all arenas and slabs that import resources from
one another.  Eventually we'll have a reclaim ktask for base_arena that
can trigger the various reclaim functions.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoPut the size in the name of kmalloc caches
Barret Rhoden [Wed, 9 Nov 2016 15:38:35 +0000 (10:38 -0500)]
Put the size in the name of kmalloc caches

This will give us better debugging output.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoMove assert in sem_down()
Barret Rhoden [Wed, 9 Nov 2016 15:37:37 +0000 (10:37 -0500)]
Move assert in sem_down()

This allows us to use qlocks before kthreads have been set up.  If we
actually will block on the qlock, then we'll still panic.  This won't
happen.  Overall, we can now use uncontested qlocks, which makes
bootstrapping a little easier.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoslab: Add the magazine and depot layers
Barret Rhoden [Mon, 7 Nov 2016 13:25:41 +0000 (08:25 -0500)]
slab: Add the magazine and depot layers

This is the per-cpu caching layer, which should increase scalability at
the cost of RAM.  The per-core 2x magazines aren't free, and the objects
in those magazines are harder to reclaim.
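
A simplified sketch of the two-magazine scheme, loosely following the
description in the vmem paper.  The magazine size, struct names, and
function names are hypothetical, and the depot is not modeled: the
functions just report when the caller would have to go to it.

```c
#include <stddef.h>

#define MAG_SIZE 4	/* hypothetical; real magazines are larger */

struct magazine {
	void *rounds[MAG_SIZE];
	unsigned int nr_rounds;
};

/* Per-core cache: two magazines, as in the "2x magazines" above. */
struct pcpu_cache {
	struct magazine *loaded;
	struct magazine *prev;
};

/* Returns an object, or NULL if both magazines are empty and the caller
 * must fall back to the depot (not modeled here). */
void *pcc_alloc(struct pcpu_cache *pcc)
{
	struct magazine *tmp;

	if (!pcc->loaded->nr_rounds && pcc->prev->nr_rounds) {
		tmp = pcc->loaded;
		pcc->loaded = pcc->prev;
		pcc->prev = tmp;
	}
	if (!pcc->loaded->nr_rounds)
		return NULL;
	return pcc->loaded->rounds[--pcc->loaded->nr_rounds];
}

/* Returns 0 on success, -1 if both magazines are full and the caller must
 * hand a full magazine to the depot (not modeled here). */
int pcc_free(struct pcpu_cache *pcc, void *obj)
{
	struct magazine *tmp;

	if (pcc->loaded->nr_rounds == MAG_SIZE &&
	    pcc->prev->nr_rounds < MAG_SIZE) {
		tmp = pcc->loaded;
		pcc->loaded = pcc->prev;
		pcc->prev = tmp;
	}
	if (pcc->loaded->nr_rounds == MAG_SIZE)
		return -1;
	pcc->loaded->rounds[pcc->loaded->nr_rounds++] = obj;
	return 0;
}
```

The fast paths touch only per-core state, which is where the scalability
comes from.  The cost is also visible: up to 2 * MAG_SIZE objects per core
sit outside the slabs, where reclaim cannot easily reach them.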

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoAdd spin_trylock_irqsave()
Barret Rhoden [Wed, 9 Nov 2016 02:02:21 +0000 (21:02 -0500)]
Add spin_trylock_irqsave()

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agox86: Pretend to be core 0 in smp_main()
Barret Rhoden [Mon, 7 Nov 2016 12:48:55 +0000 (07:48 -0500)]
x86: Pretend to be core 0 in smp_main()

This is the function that all non-core 0 cores call during boot.  They
need to get a kernel stack, among other things, which requires the memory
allocator.  The allocator, in general, will need to know a core id, but
core_id() isn't ready yet for other cores.  Since the entire machine is
single threaded at this point, we can pretend to be core 0.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoslab: Remove obj_size from struct kmem_slab
Barret Rhoden [Mon, 7 Nov 2016 00:38:05 +0000 (19:38 -0500)]
slab: Remove obj_size from struct kmem_slab

We actually can just look at the cache itself, which tracks the object
size already.  That object size technically was the unaligned object
size, but that is mostly useless.  If we want the requested, but not
actual, object size for diagnostics, we can add tracking for it.

Note that the size is passed to the ctor/dtor.  That'll go away soon
too; I don't recall if it was something we added, or if it was in the
original slab paper, but it's mostly useless.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoslab: Stop appending a uintptr_t to small objects
Barret Rhoden [Mon, 7 Nov 2016 00:20:13 +0000 (19:20 -0500)]
slab: Stop appending a uintptr_t to small objects

Previously, when objects were in the slab layer (i.e., on a list in the
slab), they were constructed.  Because of that, we needed to append a
uintptr_t to small objects to form a linked list so as to not use the
constructed object.

Now that constructing happens above the slab layer, we can use the
memory of the object itself for the linked list.  Other than being
simpler, this saves some space and avoids fragmentation.  (consider a
256 byte item - we were adding 8 bytes, which would make it not pack
neatly into a page).

Note that small objects are all 'pro-touch', so we're allowed to use the
memory of the resource we're allocating.  This has always been true.
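
A minimal sketch of threading the free list through the objects
themselves.  The function names are hypothetical and malloc() stands in
for the slab's page.  Assuming 4 KiB pages, sixteen 256 byte objects fit
in a page, while only fifteen 264 byte objects would.

```c
#include <stddef.h>
#include <stdlib.h>

/* Thread a free list through the objects themselves.  Each free object's
 * first bytes hold a pointer to the next free object.  Requires
 * obj_size >= sizeof(void *), obj_size a multiple of the pointer
 * alignment, and mem suitably aligned (malloc'd memory is). */
void *freelist_init(void *mem, size_t obj_size, size_t nr_objs)
{
	char *p = mem;

	for (size_t i = 0; i < nr_objs; i++) {
		void **link = (void **)(p + i * obj_size);

		*link = (i + 1 < nr_objs) ? p + (i + 1) * obj_size : NULL;
	}
	return nr_objs ? mem : NULL;
}

/* Pop the head.  The link stored in the object gets overwritten once the
 * caller starts using the memory. */
void *freelist_pop(void **head)
{
	void *obj = *head;

	if (obj)
		*head = *(void **)obj;
	return obj;
}

/* Push a free (unconstructed) object back, reusing its memory for the
 * link. */
void freelist_push(void **head, void *obj)
{
	*(void **)obj = *head;
	*head = obj;
}
```

This only works because objects on the list are unconstructed, so
clobbering their first bytes destroys nothing.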

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoslab: Move ctors/dtors to the slab layer
Barret Rhoden [Sun, 6 Nov 2016 16:45:49 +0000 (11:45 -0500)]
slab: Move ctors/dtors to the slab layer

In the upcoming magazine code, objects in the slabs are unconstructed.
Right now, they'll be constructed on demand, but shortly they will be
sitting in the depot and in magazines.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoRemove kmalloc caches above PGSIZE
Barret Rhoden [Thu, 3 Nov 2016 14:59:46 +0000 (10:59 -0400)]
Remove kmalloc caches above PGSIZE

Now that kpages_arena has qcaches, there's no need to have kmalloc
support caches of the same size.  We'll just call the memory allocator
directly.  Kmalloc still has its slab caches for sizes from [64, 2048].

Note that these sizes include the kmalloc_tag, which means that if you
ask for a power-of-two from kmalloc, internally it will ask for the next
higher power-of-two.  It has always been this way.  Eventually, I'd like
to get rid of the refcnt, so we can just use an arena directly and
ignore the alignment issues.
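
A small sketch of the size-class arithmetic.  TAG_SIZE is a hypothetical
value standing in for the size of the kmalloc_tag, and
kmalloc_cache_size() is an illustrative name, not the kernel function.

```c
#include <stddef.h>

#define KMALLOC_SMALLEST 64
#define KMALLOC_LARGEST  2048
#define TAG_SIZE 16	/* hypothetical sizeof(struct kmalloc_tag) */

/* Returns the slab cache size that would back a kmalloc(req), or 0 if the
 * request plus its tag is too big for the caches and would go to the page
 * allocator instead. */
size_t kmalloc_cache_size(size_t req)
{
	size_t need, sz;

	if (req > KMALLOC_LARGEST)
		return 0;
	need = req + TAG_SIZE;
	for (sz = KMALLOC_SMALLEST; sz <= KMALLOC_LARGEST; sz <<= 1) {
		if (need <= sz)
			return sz;
	}
	return 0;
}
```

Under these assumptions a 64 byte request lands in the 128 byte cache, and
a 2048 byte request no longer fits in any kmalloc cache.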

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoarena: Use qcaches (slabs) in the arena
Barret Rhoden [Thu, 3 Nov 2016 14:46:18 +0000 (10:46 -0400)]
arena: Use qcaches (slabs) in the arena

Until we get reclaim working, once memory gets added to an arena's
qcache, it'll never be returned to the arena.  I'm not overly worried
about fragmentation, since we know the size of memory in the qcache is a
regularly-desired size (e.g. PGSIZE).

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoslab: Bootstrap before setting up kpages_arena
Barret Rhoden [Wed, 2 Nov 2016 22:05:23 +0000 (18:05 -0400)]
slab: Bootstrap before setting up kpages_arena

Kpages will use kmem_cache, so we should at least run the init function
first.  While we're at it, we can statically initialize the lock and list.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoslab: Move the name into the kmem_cache
Barret Rhoden [Wed, 2 Nov 2016 20:44:26 +0000 (16:44 -0400)]
slab: Move the name into the kmem_cache

Instead of pointing at an arbitrary const string.  The arena qcaches will
want dynamic strings, and they can't kmalloc them.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoslab: Import resources from a source arena
Barret Rhoden [Wed, 2 Nov 2016 19:29:49 +0000 (15:29 -0400)]
slab: Import resources from a source arena

Previously, the implication was that all slabs pull from the general memory
allocator.  Now, they will pull from their source arenas.

Careful - for small-object, pro-touch caches, the slab layer assumes that
the arena is a page-aligned memory allocator.  I considered outlawing
small objects for caches with explicit sources, but we might have explicit
memory arenas in the future (e.g. one per NUMA domain).

Most slab caches will just use NULL for their source, which means a kpages
arena for generic kernel memory.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoslab: Support 'no-touch' caches
Barret Rhoden [Wed, 2 Nov 2016 16:54:44 +0000 (12:54 -0400)]
slab: Support 'no-touch' caches

Slab allocators that are 'no-touch' will not use their source arenas for
bookkeeping.  This means that we always use a bufctl to point to objects,
instead of appending a pointer to an object and making a small linked list.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoslab: Add an arena pointer to the interface
Barret Rhoden [Tue, 1 Nov 2016 21:13:19 +0000 (17:13 -0400)]
slab: Add an arena pointer to the interface

The pointer doesn't do anything yet; that'll come later.

The transformation outside slab files was done with:

@@
expression A;
expression B;
expression C;
expression D;
expression E;
expression F;
@@
-kmem_cache_create(A, B, C, D, E, F)
+kmem_cache_create(A, B, C, D, NULL, E, F)

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoslab: Use a hashtable when looking up bufctls
Barret Rhoden [Sun, 30 Oct 2016 23:59:53 +0000 (19:59 -0400)]
slab: Use a hashtable when looking up bufctls

It only took 7 years or so to take care of that TODO.  Note that this is
probably slower than the old approach, which was instant.  Similar to
the arena, we use hash_helper and roll our own hash table, especially
due to allocation issues when we grow the table.

Aside from being necessary for NO_TOUCH slabs, this will save a lot of
memory when it comes to fragmentation when we use slabs for arenas.
Consider kpages_arena, which will have qcaches of pages.  That will be
pulling from slabs of 1 - 8 pages.  The one-page slab allocator will
need to have obj_size = PGSIZE and align = PGSIZE.  If we don't use the
hash and instead have the uintptr_t blob per object, we'd need two pages
per one-page-object, essentially wasting half of our memory.
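
A minimal sketch of an address-keyed bufctl lookup.  The fixed bucket
count, the hash function, and all names are hypothetical; the real table
is built on hash_helper and resizes itself.

```c
#include <stddef.h>
#include <stdint.h>

#define NR_BUCKETS 16	/* hypothetical fixed size; the real table grows */
#define PGSHIFT 12	/* assumed 4 KiB pages */

struct bufctl {
	struct bufctl *next;	/* hash chain link */
	void *buf_addr;		/* the object this bufctl tracks */
};

struct bufctl_hash {
	struct bufctl *buckets[NR_BUCKETS];
};

/* Objects are at least page-aligned here, so drop the low bits first. */
static size_t hash_addr(void *addr)
{
	return ((uintptr_t)addr >> PGSHIFT) % NR_BUCKETS;
}

void bc_insert(struct bufctl_hash *h, struct bufctl *bc)
{
	size_t i = hash_addr(bc->buf_addr);

	bc->next = h->buckets[i];
	h->buckets[i] = bc;
}

/* Find and unlink the bufctl for addr (the free path), or NULL. */
struct bufctl *bc_remove(struct bufctl_hash *h, void *addr)
{
	struct bufctl **pp = &h->buckets[hash_addr(addr)];

	for (; *pp; pp = &(*pp)->next) {
		if ((*pp)->buf_addr == addr) {
			struct bufctl *bc = *pp;

			*pp = bc->next;
			return bc;
		}
	}
	return NULL;
}
```

The bookkeeping lives entirely outside the object, so a PGSIZE object with
PGSIZE alignment occupies exactly one page.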

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoslab: Bootstrap more kmem_caches
Barret Rhoden [Sun, 30 Oct 2016 22:23:00 +0000 (18:23 -0400)]
slab: Bootstrap more kmem_caches

This statically allocates all of the boot-strapping caches.  This is not
strictly necessary, but it could be if the hash table default size was
enough to make a kmem_cache a large slab object.  At that point, we'd need
all three bootstrap caches to allocate one.  This way, we have less
bootstrapping to worry about.  We'll have more to worry about when we start
using magazines.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoslab: Use BSD_LISTs for the bufctls
Barret Rhoden [Sun, 30 Oct 2016 22:23:00 +0000 (18:23 -0400)]
slab: Use BSD_LISTs for the bufctls

The slab allocator has a long-standing TODO: BUF.  Instead of using a
hash table to look up a large object, we just used storage in the object
itself.  This was okay, other than possible fragmentation effects, but
it meant that the slab allocator touched every object it tracked.  We'll
eventually need an option to have "NO_TOUCH" slab allocators, where they
do not touch the objects they are tracking.  To do that, we'll need a
hash table.

This commit switches the bufctl struct from a TAILQ to a BSD_LIST, which
will make the hash table entries smaller.  This also fixes a FOREACH
freeing bug (use FOREACH_SAFE).
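
For reference, the class of bug the _SAFE iterators avoid, sketched with a
hand-rolled list rather than the BSD_LIST macros.  The struct and function
names are hypothetical.

```c
#include <stdlib.h>

struct node {
	struct node *next;
	int val;
};

/* Buggy pattern (do not use): the loop increment reads n->next after
 * free(n), which is a use-after-free.
 *
 *	for (n = head; n; n = n->next)
 *		free(n);
 */

/* Safe pattern: save the successor before freeing the current node, which
 * is what the _SAFE list macros do.  Returns the number of nodes freed. */
size_t free_all(struct node *head)
{
	struct node *n, *tmp;
	size_t nr = 0;

	for (n = head; n; n = tmp) {
		tmp = n->next;
		free(n);
		nr++;
	}
	return nr;
}

/* Helper for building a list; returns the new head, or NULL on
 * allocation failure. */
struct node *push_front(struct node *head, int val)
{
	struct node *n = malloc(sizeof(*n));

	if (!n)
		return NULL;
	n->next = head;
	n->val = val;
	return n;
}
```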

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoSet num_cores early in boot
Barret Rhoden [Sun, 6 Nov 2016 16:21:02 +0000 (11:21 -0500)]
Set num_cores early in boot

The memory allocator will need to know the number of cores in the
system when it is initialized.  In the future, it may also need to know
the number of NUMA domains.  Determining the number of cores is somewhat
arch-specific.  We can do it with ACPI on x86, and on any other platform
that supports it.

Our ACPI code relies on the memory allocator and does a lot more than
determine the number of cores, so we have a simple helper that just
looks at the ACPI tables, finds the XSDT, then finds the MADT, then
counts the local apics.  We'll use this as num_cores (possibly an
overestimate).  The topology code will make sure we didn't
underestimate later in boot.
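
A rough sketch of the counting step only, assuming the caller has already
located the MADT and points us at the variable-length entries that follow
its fixed header.  count_lapics() is a hypothetical name, and this version
ignores the enabled flag and x2APIC entries.

```c
#include <stddef.h>
#include <stdint.h>

#define MADT_TYPE_LAPIC 0	/* Processor Local APIC entry type */

/* Walk the variable-length entries that follow the fixed MADT header.
 * Every entry starts with a one-byte type and a one-byte length.  Counts
 * every local APIC entry, enabled or not, so the result can overestimate
 * the number of usable cores. */
int count_lapics(const uint8_t *entries, size_t len)
{
	size_t off = 0;
	int nr = 0;

	while (off + 2 <= len) {
		uint8_t type = entries[off];
		uint8_t elen = entries[off + 1];

		/* A too-short or overlong length means a corrupt table. */
		if (elen < 2 || elen > len - off)
			break;
		if (type == MADT_TYPE_LAPIC)
			nr++;
		off += elen;
	}
	return nr;
}
```

Nothing here allocates memory, which is the property that matters when
the count is needed before the allocator is up.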

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoCheck booting during trace_printk()
Barret Rhoden [Sun, 6 Nov 2016 16:18:52 +0000 (11:18 -0500)]
Check booting during trace_printk()

Instead of num_cores.  This is safer, in case we set num_cores before
various per-cpu structures are set up.

The reason for this is that the memory allocator will need to know about
num_cores, and that will happen very early in the booting process.

trace_printk() will be fine if it just uses the boot object instead of
per-cpu objects during boot.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoMoving 'booting' to a header
Barret Rhoden [Sun, 6 Nov 2016 16:17:56 +0000 (11:17 -0500)]
Moving 'booting' to a header

Instead of externing it in random places.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoReplace the old page allocator with the base arena
Barret Rhoden [Fri, 28 Oct 2016 20:53:46 +0000 (16:53 -0400)]
Replace the old page allocator with the base arena

The old allocator couldn't handle higher order allocations efficiently.
As memory was used, it'd take longer and longer to find contiguous
pages.

We bootstrap the base arena and add free segments to it based on the
free memory regions of multiboot.  The kpages_arena is used for the
main pages allocator.  Right now, it's just a pass-through arena that
imports from base.  In the future, it'll have its own qcaches built in,
which will make common allocations even faster.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoAdd the arena allocator
Barret Rhoden [Fri, 28 Oct 2016 20:22:26 +0000 (16:22 -0400)]
Add the arena allocator

The arena allocator is based off of the Vmem allocator:

http://www.google.com/search?q=bonwick+vmem

This will be the basis for all memory allocation.  Right now, it does
not have integrated qcaches (slabs).  That will require some work with
the slab allocator.  You can build a jumbo page allocator, using a
helper that xallocs with an alignment, which is pretty cool.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoAdd hash_helper.h for custom dynamic hash tables
Barret Rhoden [Tue, 1 Nov 2016 00:25:21 +0000 (20:25 -0400)]
Add hash_helper.h for custom dynamic hash tables

The full-fledged dynamic hashtable.c doesn't work for a lot of code that
needs more control over its hash table.  For instance, the arena
allocator needs fine-grained control over allocations and a node's list
membership.

This header is a few building-block helpers that allow you to build your
own dynamically resized hash table.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoPort hash.h
Barret Rhoden [Mon, 31 Oct 2016 23:30:31 +0000 (19:30 -0400)]
Port hash.h

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoImport hash.h from Linux
Barret Rhoden [Mon, 31 Oct 2016 23:25:09 +0000 (19:25 -0400)]
Import hash.h from Linux

Version 4.6, which was before all the arch-specific additions.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoAdd a #define for all MEM_FLAGS
Barret Rhoden [Fri, 28 Oct 2016 20:21:42 +0000 (16:21 -0400)]
Add a #define for all MEM_FLAGS

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoRemove mon_gfp()
Barret Rhoden [Thu, 27 Oct 2016 00:24:47 +0000 (20:24 -0400)]
Remove mon_gfp()

Unused, and it was a hack into the old allocator.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agomlx4: Remove page_is_free() safety check
Barret Rhoden [Thu, 27 Oct 2016 00:11:28 +0000 (20:11 -0400)]
mlx4: Remove page_is_free() safety check

With the arena allocator, it won't be easy to query the state of a given
page.  We actually could do that, if we wanted, with an arena helper
that looks up the btag for a given address, but it's a pain, it won't be
fast, and it will probably not work well with NUMA.

Considering this style of page pinning needs to change anyways, we might
as well remove it.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agox86: Stop freeing the trampoline page
Barret Rhoden [Thu, 27 Oct 2016 00:07:42 +0000 (20:07 -0400)]
x86: Stop freeing the trampoline page

The arena allocator won't let us free something it never allocated.  The
pages[] based allocator didn't care, since we massaged the refcnts the
right way during page_alloc_init().

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoJump to a real kstack ASAP during boot
Barret Rhoden [Wed, 26 Oct 2016 23:40:50 +0000 (19:40 -0400)]
Jump to a real kstack ASAP during boot

We actually were using the bootstack, which was never actually given out
by a memory allocator, for a long time.  Eventually, we'd give it back,
when the kthread code thought it was a spare it needed to free.  This
would confuse the arena allocator, which never gave out the memory in
the first place.

Now, we'll switch to using a kernel stack that was given to us by
get_kstack() right away.  This helps both with the allocator as well as
with whatever safety checks we'll use for the kernel stacks (e.g. guard
pages).  It'd be brutal if we had one unlucky kernel stack that didn't
have the protections we thought all stacks had (or will have, in this
case).

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agox86: set pcpui->{ts,gdt} early
Barret Rhoden [Wed, 26 Oct 2016 22:17:32 +0000 (18:17 -0400)]
x86: set pcpui->{ts,gdt} early

This allows us to set/get the stacktop with the usual, arch-independent
helper early.  I'll need this during init, before smp_boot.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoUse a helper for resetting kernel stacks
Barret Rhoden [Wed, 26 Oct 2016 20:40:37 +0000 (16:40 -0400)]
Use a helper for resetting kernel stacks

It's another arch-specific helper, but I have another case in an
upcoming commit that will need to pass the function pointer.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoIntegrate rbtrees into Akaros
Barret Rhoden [Thu, 13 Oct 2016 15:24:29 +0000 (11:24 -0400)]
Integrate rbtrees into Akaros

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoImport rbtrees from Linux
Barret Rhoden [Thu, 13 Oct 2016 14:57:04 +0000 (10:57 -0400)]
Import rbtrees from Linux

From Linux commit 9a2172a8d52c ("MAINTAINERS: Switch to kernel.org email
address for Javi Merino")

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoMove __always_inline to compiler.h
Barret Rhoden [Thu, 13 Oct 2016 15:15:26 +0000 (11:15 -0400)]
Move __always_inline to compiler.h

So other code can use it.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoRemove get_cont_phys_pages_at()
Barret Rhoden [Thu, 18 Aug 2016 17:42:20 +0000 (13:42 -0400)]
Remove get_cont_phys_pages_at()

I think this was for some weird debugging code, or maybe the old NIX mode.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoRemove page coloring
Barret Rhoden [Thu, 18 Aug 2016 16:02:22 +0000 (12:02 -0400)]
Remove page coloring

Page coloring doesn't work with contiguous memory allocators, and it
partitions all levels of the cache hierarchy, which doesn't work well with
spatial partitioning.  For instance, if we partition the L3 into 8 colors
(the number is based on the cache properties), we might be partitioning the
L1 and L2 into two colors (again, based on cache properties).  Although we
now have cache isolation in the shared LLC, we also partition a cache that
is already per-core.
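
A back-of-the-envelope sketch of why coloring one cache also colors the
others.  It assumes 4 KiB pages and physically indexed caches, and the
helper names are hypothetical.  The color counts in the test values are
illustrative, not measurements of any particular machine.

```c
#include <stddef.h>
#include <stdint.h>

#define PGSIZE 4096UL	/* assumed page size */

/* Number of page colors for a physically indexed cache: the bytes covered
 * by one way, divided into pages.  Never less than one.  ways must be
 * nonzero. */
size_t nr_colors(size_t cache_bytes, size_t ways)
{
	size_t way_bytes = cache_bytes / ways;

	return way_bytes > PGSIZE ? way_bytes / PGSIZE : 1;
}

/* Color of a physical page for a cache with the given number of colors. */
size_t page_color(uint64_t paddr, size_t colors)
{
	return (paddr / PGSIZE) % colors;
}
```

If one cache has 8 colors and a smaller one has 2, a page's color in the
smaller cache is just its color in the larger one modulo 2.  Restricting a
process to some colors of the big cache therefore also restricts which
part of the smaller, per-core cache it can use.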

The better approach is to use some sort of hardware support, such as
Intel's Cache Allocation Technology.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoRemove page refcnts
Barret Rhoden [Wed, 17 Aug 2016 21:22:10 +0000 (17:22 -0400)]
Remove page refcnts

page_decref() is just page_free(), now.  I'll do a rename in a later
commit.  We still needed to track if it was free or not for the currently
lousy memory allocator.

There might be issues with this, but if you aren't willing to potentially
break compatibility with Linux, then you'll never get anywhere.

There are a few reasons to do reference counts.  The only one we still have is
for devices that want to pin user memory for operations.  Specifically, the
mlx4 OS-bypass stuff does this.  The problem is that the user allocs memory
and gives arbitrary addresses to the device.  Instead, we should have the
device own the memory and let the user mmap the memory.  That gets rid of
any issues with locking the page, since the memory is always 'safe.'

That model doesn't work with traditional scatter-gather.  Worst case, we
can come up with something where we lock the VMR, instead of the page.
Though I'd rather come up with more explicit block data transfer
interfaces.

Note that the mlx4 OS-bypass is extremely dangerous now.  I think it was
always leaking memory before, btw.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoProvide a shim layer for reference counted pages
Barret Rhoden [Tue, 23 Aug 2016 20:54:38 +0000 (16:54 -0400)]
Provide a shim layer for reference counted pages

Right now, all pages are reference counted.  I'd like to try to stop doing
that to make contig allocations and maybe jumbo pages easier.  Longer term,
I'd like to get away from having a page struct too, though we'll see.

Some code, specifically mlx4, wants page allocations and to do reference
counting per page.  For that code, we provide this shim.

It actually looks like there are some bugs in mlx4's allocation/freeing
code, and how they account for fragments and references for higher-order
allocations.  Linux 4.7 seems to have the same structure, though perhaps
there are different semantics there.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoRefactor map_page_at_addr
Barret Rhoden [Tue, 16 Aug 2016 19:52:04 +0000 (15:52 -0400)]
Refactor map_page_at_addr

The pte_is_mapped() case was a little sketchy.  The page_is_pagemap() check
was a little hard to follow; it's easier if the caller tells us what to do,
instead of us inferring what to do.

This also fixes a memory leak in __hpf, where if we failed to map a
non-page-map page, we neglected to free it.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoFix bounds checks and misc errors in mm.c
Barret Rhoden [Tue, 16 Aug 2016 20:23:41 +0000 (16:23 -0400)]
Fix bounds checks and misc errors in mm.c

Some of the UMAPTOP checks could be overflowed.  There are probably more
throughout the kernel (though not for UMAPTOP).  Using the umem helper
simplifies the logic a bit.
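
The general shape of the overflow problem, as a sketch.  The UMAPTOP value
below is a hypothetical placeholder, and range_ok() is an illustrative
name, not the umem helper.

```c
#include <stdbool.h>
#include <stdint.h>

#define UMAPTOP 0x0000800000000000ULL	/* hypothetical top of user space */

/* Naive check: addr + len can wrap past zero and look small. */
bool range_ok_naive(uint64_t addr, uint64_t len)
{
	return addr + len <= UMAPTOP;
}

/* Overflow-safe check: compare len against the remaining room instead of
 * computing addr + len. */
bool range_ok(uint64_t addr, uint64_t len)
{
	return addr <= UMAPTOP && len <= UMAPTOP - addr;
}
```

With addr = 0x0000400000000000 and len = 0xffffc00000000000, the sum wraps
to zero, so the naive check accepts the range and the safe one rejects it.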

For those curious, mprotect()'s ENOMEM errno is what the man page says to
do, even though the others do EINVAL.

The printk change for create_vmr's failure is in the hopes of catching a
bug.  I occasionally see this:

cs has not created #srv/cs yet, spinning until it does....

kernel warning at kern/src/mm.c:103, from core 0: Not making a VMR,
        wanted 0x0000400000000000, + 0x00003b5100001000 = 0x00007b5100001000

[kernel] do_mmap() aborted for 0x0000400000000000 + 4096!

The do_mmap()'s printk would have truncated the top part of len (0x3b51),
if it was passed.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoRemove SYS_cache_buster (XCC)
Barret Rhoden [Tue, 16 Aug 2016 17:59:56 +0000 (13:59 -0400)]
Remove SYS_cache_buster (XCC)

"You're killing me, Buster."

Reinstall your kernel headers.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoFix extra decref of shared_page
Barret Rhoden [Tue, 16 Aug 2016 17:55:35 +0000 (13:55 -0400)]
Fix extra decref of shared_page

We should never be freeing shared_page once it is allocated.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoMake page_insert() consume the caller's refcnt
Barret Rhoden [Tue, 16 Aug 2016 17:48:49 +0000 (13:48 -0400)]
Make page_insert() consume the caller's refcnt

The page refcounting needs to go.  The refcnt was from a time when a page
could have multiple objects tracking it independently.  Nowadays that is
handled higher up, such as in the page cache.

For the most part, the freeing/allocating of the memory is handled higher
up in the stack.  We were already doing this with e.g. procinfo, where we
would free it twice, doing double the work necessary.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoFix the remaining /dev/ -> /dev_vfs/
Barret Rhoden [Wed, 26 Oct 2016 23:50:33 +0000 (19:50 -0400)]
Fix the remaining /dev/ -> /dev_vfs/

Probably the last ones.  =)

This only affected you if you attempted to build the ancient EXT2
support.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoVMM: Fix virtio-net bytestostrip initialization
Barret Rhoden [Mon, 28 Nov 2016 15:24:28 +0000 (10:24 -0500)]
VMM: Fix virtio-net bytestostrip initialization

Needs to be initialized in per-virtio loop.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoFix slow uthread context switches
Barret Rhoden [Tue, 8 Nov 2016 16:16:44 +0000 (11:16 -0500)]
Fix slow uthread context switches

The lock addq is accessing 8 bytes, but we only need to access one byte.
Accessing 8 bytes could span a cacheline boundary, which it does currently.
Doing so causes two cache misses!
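
A quick way to check whether an access of a given width straddles a line,
assuming 64 byte cache lines.  spans_cacheline() is a hypothetical helper
for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CACHELINE 64	/* assumed cache line size in bytes */

/* True if an access of 'size' bytes at 'addr' touches two cache lines. */
bool spans_cacheline(uintptr_t addr, size_t size)
{
	return size && (addr / CACHELINE) != ((addr + size - 1) / CACHELINE);
}
```

An 8 byte access at offset 60 within a line spans two lines; a 1 byte
access at the same address cannot.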

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agovmm: allow a vmm to override the vmcall function
Ronald G. Minnich [Tue, 29 Nov 2016 01:23:45 +0000 (17:23 -0800)]
vmm: allow a vmm to override the vmcall function

Add a vmcall struct to the guest thread struct. This
allows us, on a guest thread by guest thread basis, to
support vmcalls.

I've tested this with dune and it works fine.
Longer term, we may want to define an ops structure
but I think that's rushing it a bit.

Change-Id: Ic381f0e70946ba2396303e5d6428bc999ec4b6dd
Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agovmx: Add and use constants for PML and TSC Scaling
Fergus Simpson [Tue, 29 Nov 2016 01:35:39 +0000 (17:35 -0800)]
vmx: Add and use constants for PML and TSC Scaling

This adds definitions for secondary processor-based VM-Execution
controls "Enable PML" and "TSC Scaling".

The need for attempting to unset Enable PML was discovered on a
Broadwell-DE system and TSC Scaling was previously an undocumented
constant.

Change-Id: If4eec1f43da084d6f1c3764c31f7075a9f5605d3
Signed-off-by: Fergus Simpson <afergs@google.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoRewrite _rock_get_listen_fd to make it simpler. (XCC)
Dan Cross [Mon, 21 Nov 2016 20:57:11 +0000 (15:57 -0500)]
Rewrite _rock_get_listen_fd to make it simpler. (XCC)

Use `strrchr` to find the last '/' in the source string when
finding the 'ctl' component.  More verbose error reporting and
assertions.
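
A sketch of the strrchr approach.  sibling_ctl_path() is a hypothetical
helper written for illustration; it is not the glibc code and makes no
claim about what _rock_get_listen_fd does beyond finding the last '/'.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: given a path to some file in a connection
 * directory (e.g. ".../data"), write the sibling "ctl" path into out.
 * Returns 0 on success, -1 if there is no '/' or out is too small. */
int sibling_ctl_path(const char *path, char *out, size_t outlen)
{
	const char *slash = strrchr(path, '/');
	int n;

	if (!slash)
		return -1;
	/* Copy everything before the last '/', then append "/ctl". */
	n = snprintf(out, outlen, "%.*s/ctl", (int)(slash - path), path);
	if (n < 0 || (size_t)n >= outlen)
		return -1;
	return 0;
}
```

For example, "/net/tcp/0/data" becomes "/net/tcp/0/ctl".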

Rebuild glibc.

Change-Id: I176170d96130403b1e2fa42506caa50a02712e32
Signed-off-by: Dan Cross <crossd@gmail.com>
[xcc warning]
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoFix random checkpatch warnings and errors in plan9_sockets.c.
Dan Cross [Mon, 21 Nov 2016 20:56:07 +0000 (15:56 -0500)]
Fix random checkpatch warnings and errors in plan9_sockets.c.

Change-Id: I80b4310e76f84a57cab01045b741a199148e1b51
Signed-off-by: Dan Cross <crossd@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agouser/vmm: fflush stdout on every write
Ronald G. Minnich [Tue, 22 Nov 2016 00:01:03 +0000 (16:01 -0800)]
user/vmm: fflush stdout on every write

Things are not reliable enough yet to assume a final fflush
on stdout will happen. Just fflush on every character.

Change-Id: Ib24b6844205849b7d50882ff1724bd46a19ba4b3
Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agodune: clean up and remove lots of cruft
Ronald G. Minnich [Mon, 21 Nov 2016 17:22:20 +0000 (09:22 -0800)]
dune: clean up and remove lots of cruft

Dune was derived from test programs first written in early 2015.
We might as well take the opportunity to decruft it.

Change-Id: I955f3f64ab3e387d9f093f5fea158fa3c1d4c8e9
Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agodune: add a dune command
Ronald G. Minnich [Mon, 21 Nov 2016 17:03:50 +0000 (09:03 -0800)]
dune: add a dune command

This is much like the Stanford dune system in that it is
designed to run simple non-kernels that support user
mode programs. It lets us show the ease of implementation
of such a command in the Akaros VM model.

To start, this is just a clone of vmrunkernel.

Change-Id: I2ac0fdddd3e834e6d9ea06d75c166a60d1fb4775
Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agouser/vmm: print the RSP as well as RIP
Ronald G. Minnich [Mon, 21 Nov 2016 16:55:13 +0000 (08:55 -0800)]
user/vmm: print the RSP as well as RIP

Change-Id: I2f3df21c7a68dd3bde7142b6ba4f255ad62ad9f7
Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoSimplify block_alloc function
Fergus Simpson [Thu, 17 Nov 2016 17:35:33 +0000 (09:35 -0800)]
Simplify block_alloc function

Removed optimization from Plan 9 where the driver would attempt to make
use of extra memory reserved by malloc. Akaros does not currently have
the capability to get the real size of the reserved memory, so leaving
the optimization in just resulted in some complicated pointer arithmetic
that always yielded the defined constant Hdrspc.

The optimization has been left in comments in case Akaros ever gets the
ability to get the actual size of reserved memory.

Also added an assert that Hdrspc is aligned to BLOCKALIGN - if it were
not then Hdrspc would randomly be truncated by up to Hdrspc%BLOCKALIGN
bytes.
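
The truncation can be illustrated with a small sketch. It assumes the data pointer is placed Hdrspc bytes past an aligned base and then rounded down to BLOCKALIGN; the values and the helper names (HDRSPC, align_down, effective_hdrspc) are illustrative and are not taken from the Akaros source.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative values only; not necessarily the kernel's. */
#define BLOCKALIGN 32u
#define HDRSPC 128u

/* Compile-time form of the alignment check described above. */
static_assert(HDRSPC % BLOCKALIGN == 0,
              "HDRSPC must be a multiple of BLOCKALIGN");

/* Round addr down to a BLOCKALIGN boundary (BLOCKALIGN is a power of 2). */
static uintptr_t align_down(uintptr_t addr)
{
	return addr & ~((uintptr_t)BLOCKALIGN - 1);
}

/* Header room actually left when the data pointer is placed hdrspc bytes
 * past an aligned base and then rounded down (hypothetical helper). */
static uintptr_t effective_hdrspc(uintptr_t base, uintptr_t hdrspc)
{
	assert(base % BLOCKALIGN == 0);
	return align_down(base + hdrspc) - base;
}
```

With these values, an unaligned header space of 70 bytes would shrink to 64, losing 70 % 32 = 6 bytes, while 128 is preserved exactly.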

Change-Id: I5249df6fdd8f47f0f07b35fcf3f7fed45f61d383
Signed-off-by: Fergus Simpson <afergs@google.com>
[removed mlx4 references]
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agoFix virtio net handling of the header.
Gan Shun [Thu, 17 Nov 2016 18:49:18 +0000 (10:49 -0800)]
Fix virtio net handling of the header.

We weren't stripping the header off correctly, and we didn't handle the
case where the guest would use a separate iov for the virtio net header.
This commit properly finds the offset where the ethernet frame begins
and writes that to the NIC.
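
As a rough sketch of the offset search (not the code from this commit), the function below skips hdr_len bytes across an iovec array and reports where the ethernet frame would begin. The names frame_pos and find_frame_start are hypothetical, and the header length is passed in by the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/* Where the payload starts: which iovec, and the offset inside it. */
struct frame_pos {
	int iov_idx;	/* -1 if there is no payload after the header */
	size_t offset;
};

/* Skip hdr_len bytes of header, which may share an iovec with the frame
 * or occupy one or more iovecs of its own. */
static struct frame_pos find_frame_start(const struct iovec *iov, int iovcnt,
                                         size_t hdr_len)
{
	struct frame_pos pos = { -1, 0 };

	assert(iovcnt >= 0);
	for (int i = 0; i < iovcnt; i++) {
		if (hdr_len < iov[i].iov_len) {
			/* The header ends inside this iovec. */
			pos.iov_idx = i;
			pos.offset = hdr_len;
			return pos;
		}
		/* This iovec is all header (or empty); consume it. */
		hdr_len -= iov[i].iov_len;
	}
	return pos;
}
```

When the guest puts the header in its own iovec, the result is index 1 at offset 0; when header and frame share a buffer, it is index 0 at offset hdr_len.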

Signed-off-by: Gan Shun <ganshun@gmail.com>
Change-Id: I6a2ad870d00752a60386bfde8b7b01287f95899d
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agovmrunkernel: add option to set the stack
Ronald G. Minnich [Tue, 15 Nov 2016 00:01:52 +0000 (16:01 -0800)]
vmrunkernel: add option to set the stack

In most cases we don't want to set the stack,
but add an option so we can set it if needed.

Change-Id: I686211b723acfe6efc86a4fc01c1c89c52659d70
Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agovmrunkernel: get rid of coreboot tables
Ronald G. Minnich [Mon, 14 Nov 2016 23:53:28 +0000 (15:53 -0800)]
vmrunkernel: get rid of coreboot tables

Maybe we will need them someday but not now.

Change-Id: Ib731eef45a43f6059c1c9fbf8918b771814ca723
Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
2 years agovmrunkernel: allow -M for setting memory start
Ronald G. Minnich [Mon, 14 Nov 2016 23:18:12 +0000 (15:18 -0800)]
vmrunkernel: allow -M for setting memory start

And, as part of finding compiler warnings to make
this work, do some cleanup.

Oh, and on seeing that the help message
was woefully wrong, fix that too by having it
print the contents of the options struct,
not a string that will keep getting out of date :-)
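
A minimal sketch of generating the help text from the option table, assuming a getopt_long-style struct option array. The long option names and print_usage are made up for illustration; only -m and -M are mentioned in this log.

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical option table; the long names are illustrative. */
static const struct option example_opts[] = {
	{ "memsize",  required_argument, 0, 'm' },
	{ "memstart", required_argument, 0, 'M' },
	{ "help",     no_argument,       0, 'h' },
	{ 0, 0, 0, 0 }
};

/* Print one usage line per table entry, so the help text cannot drift
 * from the options that are actually parsed. Returns the entry count. */
static int print_usage(FILE *out, const struct option *opts)
{
	int n = 0;

	assert(out != NULL && opts != NULL);
	for (; opts->name; opts++, n++)
		fprintf(out, "  -%c, --%s%s\n", opts->val, opts->name,
		        opts->has_arg == required_argument ? " <arg>" : "");
	return n;
}
```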

Change-Id: I98b25095ff2f1255afbf1257d56197b1f6bc8d08
Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
[formatting nits]
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoAdding documentation for using adt and gerrit to Contributing.md
Gan Shun [Wed, 9 Nov 2016 22:36:17 +0000 (14:36 -0800)]
Adding documentation for using adt and gerrit to Contributing.md

Signed-off-by: Gan Shun <ganshun@gmail.com>
Change-Id: I4b58d08e88d570c1d237f7cb14ef79fd21654940
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agovmrunkernel: remove statically allocated _kernel[]
Ronald G. Minnich [Thu, 3 Nov 2016 20:05:49 +0000 (13:05 -0700)]
vmrunkernel: remove statically allocated _kernel[]

kernel memory is now dynamically allocated.
It always starts at 16 MiB, a good choice for linux.
It defaults to 1GiB but you can change the size
via -m.

The startup code makes sure that __procinfo.program_end
is < 16 MiB, and that 16 MiB + memsize does not intrude into
BRK_START.

We also don't use MAP_FIXED. Rather, we test after
the mmap that we got the address we want. This
ensures that we got our mapping and that we did
not get it at the expense of unmapping something else.
It's a more conservative test than using MAP_FIXED
and testing for MAP_FAILED.
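
The check described above can be sketched as follows. This is an illustration of the technique on a POSIX-like system, not the vmrunkernel code; map_exactly_at is a hypothetical name.

```c
#define _DEFAULT_SOURCE		/* for MAP_ANONYMOUS under strict -std modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Ask for len bytes at want, passed only as a hint (no MAP_FIXED).
 * Returns want on success. Returns NULL if the mapping failed or landed
 * elsewhere, in which case nothing that was already mapped is disturbed. */
static void *map_exactly_at(void *want, size_t len)
{
	void *got;

	assert(len > 0);
	got = mmap(want, len, PROT_READ | PROT_WRITE,
	           MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (got == MAP_FAILED)
		return NULL;
	if (got != want) {
		/* The hint was not honored; give the stray mapping back. */
		munmap(got, len);
		return NULL;
	}
	return got;
}
```

With MAP_FIXED the call would succeed even when the range was already in use, silently replacing whatever was there; the hint-and-compare approach reports that case as a failure instead.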

Tested by booting a Linux kernel.

Change-Id: I6dc2c8e729f27c143e38f53a229e84ab145fb051
Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoAdded ADT script
Gan Shun [Wed, 2 Nov 2016 17:35:23 +0000 (10:35 -0700)]
Added ADT script

This allows us to easily push to gerrit and set up custom reviewers and
topics. The topic defaults to the local branch name unless otherwise
specified.

Signed-off-by: Gan Shun <ganshun@gmail.com>
Change-Id: I841ed157ef6d663d718368652654b0b6039bdc7a
[removed blank at EOF]
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agovmrunkernel: load the file using the ELF library
Ronald G. Minnich [Tue, 1 Nov 2016 16:41:38 +0000 (09:41 -0700)]
vmrunkernel: load the file using the ELF library

This has been used to boot a full Linux kernel environment
to multiuser.

Change-Id: I9ba0ef062f05994225358e92a24de2d7934c8cd9
Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoAdd a script to help review gerrit patch sets
Barret Rhoden [Tue, 1 Nov 2016 17:36:56 +0000 (13:36 -0400)]
Add a script to help review gerrit patch sets

Like git track-review, this grabs a branch for a gerrit change (with git
gerrit-track), extracts it into patches, and runs checkpatch.

It could just as easily call git checkpatch, but breaking it into .patches
helps a little.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoReorder the top level Makefile so that full builds work again
Ronald G. Minnich [Mon, 31 Oct 2016 22:56:28 +0000 (15:56 -0700)]
Reorder the top level Makefile so that full builds work again

Otherwise, they fail, as gelf.h is not installed when
make tests runs.

Change-Id: If19d8515706a7a43ccd37bf2e60fbf88ce4cd581
Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoDocfix: changed obj/kernel to obj/kern
Fergus Simpson [Tue, 1 Nov 2016 00:43:30 +0000 (17:43 -0700)]
Docfix: changed obj/kernel to obj/kern

There is no kernel folder in obj, just kern.

Change-Id: Id3f901fc0c347cb5e0c5fa220ce83f7338199770
Signed-off-by: Fergus Simpson <afergs@google.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoAdd script to track a particular gerrit change
Gan Shun [Fri, 28 Oct 2016 19:30:27 +0000 (12:30 -0700)]
Add script to track a particular gerrit change

This script pulls the latest patch-set from gerrit for a particular
change and creates a branch.

Signed-off-by: Gan Shun <ganshun@gmail.com>
Change-Id: I56120268935ecca38b978a8e519d5cab6430e70f

3 years agoMove the BRK_START to a fixed, safe address (XCC)
Barret Rhoden [Wed, 26 Oct 2016 19:19:07 +0000 (15:19 -0400)]
Move the BRK_START to a fixed, safe address (XCC)

The VM code often wants to mmap blobs at various fixed addresses, such
as the guest kernel.  Our old glibc heap would start right at the top of
the program's loading point, which meant that we couldn't safely use any
of that memory.  The current vmrunkernel just has a huge array that
covers the memory regions it expects to use.  This is less than ideal.

This commit just specifies a region of the process's virtual address
space that glibc will use for its sbrk() allocations (e.g. malloc()).
Any program can safely mmap with MAP_FIXED below this address (up to the
binary's end point, which the kernel reports in procinfo->program_end).

Here's a before and after.  Note the old 0x21000 bytes has moved from
0x647000 to its new location at 0x100000000000.

bash-4.3$ cat /proc/self/maps
00100000-00120000 rwxp 00000000 01:00 146 /lib/ld-2.19.so
00320000-00321000 r--p 00020000 01:00 146 /lib/ld-2.19.so
00321000-00322000 rw-p 00021000 01:00 146 /lib/ld-2.19.so
00322000-00323000 rw-p 00000000 00:00 0 [heap]
00400000-00443000 r-x- 00000000 01:00 102 /bin/busybox
00443000-00444000 r-xp 00043000 01:00 102 /bin/busybox
00643000-00644000 rw-p 00043000 01:00 102 /bin/busybox
00644000-00647000 rw-p 00000000 00:00 0 [heap]
00647000-00668000 rwx- 00000000 00:00 0 [heap]
400000000000-400000001000 rw-p 00000000 01:00 146 /lib/ld-2.19.so
400000001000-400000002000 rw-p 00000000 00:00 0 [heap]
400000002000-400000141000 r-xp 00000000 01:00 182 /lib/libc-2.19.so
400000141000-400000341000 ---p 0013f000 01:00 182 /lib/libc-2.19.so
400000341000-400000345000 r--p 0013f000 01:00 182 /lib/libc-2.19.so
400000345000-400000347000 rw-p 00143000 01:00 182 /lib/libc-2.19.so
400000347000-40000034a000 rw-p 00000000 00:00 0 [heap]
40000034a000-40000034b000 rw-p 00000000 00:00 0 [heap]
40000034b000-40000034f000 rw-- 00000000 00:00 0 [heap]
40000034f000-400000351000 rwx- 00000000 00:00 0 [heap]
400000351000-400000353000 rw-- 00000000 00:00 0 [heap]
7f7fff8ff000-7f7fff9ff000 rw-- 00000000 00:00 0 [heap]

bash-4.3$ cat /proc/self/maps
00100000-00120000 rwxp 00000000 01:00 146 /lib/ld-2.19.so
00320000-00321000 r--p 00020000 01:00 146 /lib/ld-2.19.so
00321000-00322000 rw-p 00021000 01:00 146 /lib/ld-2.19.so
00322000-00323000 rw-p 00000000 00:00 0 [heap]
00400000-00443000 r-x- 00000000 01:00 102 /bin/busybox
00443000-00444000 r-xp 00043000 01:00 102 /bin/busybox
00643000-00644000 rw-p 00043000 01:00 102 /bin/busybox
00644000-00647000 rw-p 00000000 00:00 0 [heap]
100000000000-100000021000 rwx- 00000000 00:00 0 [heap]
400000000000-400000001000 rw-p 00000000 01:00 146 /lib/ld-2.19.so
400000001000-400000002000 rw-p 00000000 00:00 0 [heap]
400000002000-400000141000 r-xp 00000000 01:00 182 /lib/libc-2.19.so
400000141000-400000341000 ---p 0013f000 01:00 182 /lib/libc-2.19.so
400000341000-400000345000 r--p 0013f000 01:00 182 /lib/libc-2.19.so
400000345000-400000347000 rw-p 00143000 01:00 182 /lib/libc-2.19.so
400000347000-40000034a000 rw-p 00000000 00:00 0 [heap]
40000034a000-40000034b000 rw-p 00000000 00:00 0 [heap]
40000034b000-40000034f000 rw-- 00000000 00:00 0 [heap]
40000034f000-400000351000 rwx- 00000000 00:00 0 [heap]
400000351000-400000353000 rw-- 00000000 00:00 0 [heap]
7f7fff8ff000-7f7fff9ff000 rw-- 00000000 00:00 0 [heap]

Rebuild glibc.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoRemove proc->heap_top
Barret Rhoden [Wed, 26 Oct 2016 19:40:40 +0000 (15:40 -0400)]
Remove proc->heap_top

Unused.

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoRemove init=/bin/sh from vmimage_cmdine
Gan Shun [Wed, 26 Oct 2016 18:08:58 +0000 (11:08 -0700)]
Remove init=/bin/sh from vmimage_cmdine

We no longer need that to boot all the time, so I'm removing it from the
defaults.

Signed-off-by: Gan Shun <ganshun@gmail.com>
Change-Id: I18e7023475a8de4abf0588dbc7c298ccd6632e89
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoDelete unsupported entries for userspace MSR handling.
Gan Shun [Wed, 26 Oct 2016 18:08:57 +0000 (11:08 -0700)]
Delete unsupported entries for userspace MSR handling.

We can't handle most of these emulated MSRs because we don't actually read
and write the MSR in userspace. Removing them from the emmsr array.

Signed-off-by: Gan Shun <ganshun@gmail.com>
Change-Id: I127adf7ef346df7a5aeb3959b4b41afc25921c49
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoFix IA32_MISCENABLE disabling of PEBS
Gan Shun [Wed, 26 Oct 2016 18:08:56 +0000 (11:08 -0700)]
Fix IA32_MISCENABLE disabling of PEBS

We weren't correctly checking the written value. We tell the guest that
PEBS is disabled, thus when they write the same value back to the MSR, we
should check for the disable bit in miscenable.

Signed-off-by: Gan Shun <ganshun@gmail.com>
Change-Id: I0e00119d7fec678e2c4e3b2185565444022ac140
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoAHCI: Prevent sign extension of partial address
Fergus Simpson [Thu, 20 Oct 2016 19:00:55 +0000 (12:00 -0700)]
AHCI: Prevent sign extension of partial address

Drive reads were not working past the 1 TiB mark because the resulting
address was negative. This was determined to be an issue with an
unsigned char getting sign extended when bit shifted into an int64_t.
It is now cast to a uint32_t after the shift to prevent sign extension.
The container was also changed from int64_t to uint64_t.
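
The failure mode can be reproduced in a few lines. This sketch models the bug rather than quoting the driver: the function names are hypothetical, and the fixed version widens each byte before shifting, which differs slightly from the cast placement described above. The conversion to int32_t stands in for what `b[3] << 24` produces on common compilers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Models the bug: the top byte is shifted as a 32-bit signed value, so an
 * LBA with bit 31 set sign-extends when widened to 64 bits. */
static int64_t lba_sign_extended(const uint8_t b[4])
{
	int32_t top;

	assert(b != NULL);
	top = (int32_t)((uint32_t)b[3] << 24);
	return (int64_t)(top | b[2] << 16 | b[1] << 8 | b[0]);
}

/* Assemble the same four bytes using unsigned arithmetic throughout. */
static uint64_t lba_fixed(const uint8_t b[4])
{
	assert(b != NULL);
	return (uint64_t)((uint32_t)b[3] << 24 | (uint32_t)b[2] << 16 |
	                  (uint32_t)b[1] << 8 | b[0]);
}
```

An LBA of 0x80000000 with 512-byte sectors is byte offset 2^40, which is exactly the 1 TiB mark where reads began to fail.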

Change-Id: I590b0da4fd0c02b0e2542a0b65bde510bba89525
Signed-off-by: Fergus Simpson <afergs@google.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoAHCI: Skip device permission check
Fergus Simpson [Tue, 18 Oct 2016 00:11:26 +0000 (17:11 -0700)]
AHCI: Skip device permission check

Permissions are not currently implemented in Akaros. This change simply
makes it so that devpermcheck(...) always returns before throwing an
error that would result in permission being denied. This allows the
device to actually be used even though permissions have not been
implemented.

Change-Id: Ic2f19071803bba497d916031a22bcbc0b70e8ffd
Signed-off-by: Fergus Simpson <afergs@google.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoAHCI: Add C600 HBA and fix PCI iteration bugs
Fergus Simpson [Tue, 18 Oct 2016 00:09:30 +0000 (17:09 -0700)]
AHCI: Add C600 HBA and fix PCI iteration bugs

This commit adds the C600 HBA to the list of recognized Intel HBAs so my
machine with one can use it, and makes two fixes to the PCI driver.

The first fix is in device detection. When detecting PCI devices, the driver
iterates over all functions on all devices. A device can have up to 8
functions (0-7) and the driver assumes they are sequential, giving up
when one is not found. This should not be done. A device is detected by
whether function 0 is implemented - if it is not, no device is connected.
While a device must implement function 0, it does not need to implement
its other functions sequentially. The C600 for example implements 0, 2,
3, so the driver did not detect functions 2 and 3 and the HBA did not
work. The driver has been changed so that it will only give up if
function 0 is not found.
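
The corrected scan can be sketched like this. The probe callback, scan_device, and the sample device layout are all hypothetical stand-ins for real config-space reads; only the "stop only if function 0 is missing" rule comes from the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PCI_MAX_FUNC 8

/* Hypothetical probe: returns true if bus/dev/func responds. */
typedef bool (*pci_probe_fn)(int bus, int dev, int func);

/* Returns a bitmask of the functions found on one device. Only a missing
 * function 0 ends the scan early; gaps in functions 1-7 are skipped. */
static uint8_t scan_device(int bus, int dev, pci_probe_fn probe)
{
	uint8_t found = 0;

	assert(probe != NULL);
	if (!probe(bus, dev, 0))
		return 0;	/* no function 0 means no device here */
	found |= 1;
	for (int func = 1; func < PCI_MAX_FUNC; func++) {
		if (probe(bus, dev, func))
			found |= (uint8_t)(1u << func);
	}
	return found;
}

/* Sample probe: device 0 implements functions 0, 2 and 3, like the
 * layout described above; every other device is absent. */
static bool sparse_probe(int bus, int dev, int func)
{
	(void)bus;
	return dev == 0 && (func == 0 || func == 2 || func == 3);
}
```

A scan that gave up at the first missing function would stop at function 1 and report only function 0 for this device.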

The second fix is for devices on bus 0xff - the last bus - which the
driver would not detect. There was a comment about issues with bus 0xff,
but that doesn't seem to be an issue any more, so the driver will now
check the last bus.

Change-Id: I8dcac3f27b4983a9141e5700d73a758389cef75a
Signed-off-by: Fergus Simpson <afergs@google.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoAHCI: Replace MMIO accesses with helper functions
Fergus Simpson [Mon, 17 Oct 2016 22:08:51 +0000 (15:08 -0700)]
AHCI: Replace MMIO accesses with helper functions

This commit removes all pointer accesses to MMIO by removing the
structs that represented MMIO. They have been replaced by helper
functions that use volatile accesses to make sure that the reads
and writes always happen. Instead of structs, blocks of memory are
simply used that are indexed into using constants that represent each
register. All virtual addresses are represented by void pointers, and
all physical addresses would be represented by uintptr_t types; however,
all physical addresses are stored in MMIO and hence are only accessed
with the provided helper functions. They are only written to as
addresses needed by the HBA. The host keeps its own references to the
structs as void pointers to virtual memory.
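
A minimal sketch of this style of accessor, assuming 32-bit registers addressed by byte offset from a void pointer base. The helper and constant names are hypothetical, not the ones used in the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative register offsets (bytes from the start of the block). */
#define REG_CAP 0x00u
#define REG_GHC 0x04u

/* Read a 32-bit register. The volatile access keeps the compiler from
 * caching, merging, or dropping the load. */
static inline uint32_t mmio_read32(void *base, uint32_t reg)
{
	assert(reg % 4 == 0);
	return *(volatile uint32_t *)((uint8_t *)base + reg);
}

/* Write a 32-bit register; volatile ensures the store is emitted. */
static inline void mmio_write32(void *base, uint32_t reg, uint32_t val)
{
	assert(reg % 4 == 0);
	*(volatile uint32_t *)((uint8_t *)base + reg) = val;
}
```

Because callers only ever pass a base pointer and a named offset, no struct layout has to match the hardware, and every access goes through one place where ordering or tracing could later be added.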

Change-Id: Ia62cd57797ca8db9f21f47559c524149ad6fc11e
Signed-off-by: Fergus Simpson <afergs@google.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoAHCI: Fix hardware address gets in driver
Fergus Simpson [Mon, 17 Oct 2016 21:55:51 +0000 (14:55 -0700)]
AHCI: Fix hardware address gets in driver

The AHCI driver was using PCIWADDR(ptr) to get the physical address of
memory mapped structs, but only assigning it to the lower 32 bits of
any address field and setting the upper 32 bits to 0. AHCI's memory
mapped structs use 32-bit registers so both halves are stored in
sequential registers.

This fix uses paddr_low32(ptr) and paddr_high32(ptr) to get both halves
of the address.

This should fix issues that occur when a memory mapped struct is outside
of the 32-bit address space.
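
The split can be sketched as below. These helpers take an already-computed physical address and use hypothetical names; they are not the Akaros paddr_low32() and paddr_high32() helpers, which take a pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Lower and upper halves of a 64-bit physical address. */
static uint32_t addr_low32(uint64_t paddr)
{
	return (uint32_t)(paddr & 0xffffffffu);
}

static uint32_t addr_high32(uint64_t paddr)
{
	return (uint32_t)(paddr >> 32);
}

/* Store a 64-bit address into two sequential 32-bit registers, low half
 * first (the register order is an assumption of this sketch). */
static void write_addr_pair(volatile uint32_t *regs, uint64_t paddr)
{
	assert(regs != NULL);
	regs[0] = addr_low32(paddr);
	regs[1] = addr_high32(paddr);
}
```

Writing 0 to the upper register, as the old code did, only happens to work while every structure lives below 4 GiB.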

Change-Id: I8e5ef62c580cc002510ccabadef9c2fcf0153bc8
Signed-off-by: Fergus Simpson <afergs@google.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoAHCI: Remove struct typedefs from driver
Fergus Simpson [Mon, 17 Oct 2016 21:54:12 +0000 (14:54 -0700)]
AHCI: Remove struct typedefs from driver

This makes the driver more consistent with the rest of Akaros's code.

Change-Id: I427e439ee1b34a2bcf5ec86c94e4f590c0681ee5
Signed-off-by: Fergus Simpson <afergs@google.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoAHCI: get it to build and almost work.
Ronald G. Minnich [Mon, 17 Oct 2016 21:36:44 +0000 (14:36 -0700)]
AHCI: get it to build and almost work.

In qemu, it still shows the device as having zero bytes.
But it does find it.

Change-Id: I81262b460a9cd43a848c1d782c109ec216afb795
Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
Signed-off-by: Fergus Simpson <afergs@google.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agomlx4: /dev/ -> /dev_vfs/
Barret Rhoden [Tue, 18 Oct 2016 18:21:57 +0000 (14:21 -0400)]
mlx4: /dev/ -> /dev_vfs/

Fixes the mlx4 driver in accordance with commit 9724f9a56650 ("Move VFS
/dev/ -> /dev_vfs/").

Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoConvert the capability device to use SHA1
Ronald G. Minnich [Fri, 14 Oct 2016 22:43:03 +0000 (15:43 -0700)]
Convert the capability device to use SHA1

This involves a minor code change but I take the opportunity
to clean things up, getting rid of files we don't need,
and fixing includes.

Change-Id: Ie9ead4b6a2473d2f25b7b0a777343aef598f8dd9
Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agocapability device: get it to compile
Ronald G. Minnich [Fri, 14 Oct 2016 20:26:18 +0000 (13:26 -0700)]
capability device: get it to compile

We need to do something about the use of sha1.

Change-Id: I80795609ccea1ac629cb7b9d4a95040cc040d76a
Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
[whitespace]
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agocapability: run scripts/PLAN9 on capability device
Ronald G. Minnich [Fri, 14 Oct 2016 20:26:17 +0000 (13:26 -0700)]
capability: run scripts/PLAN9 on capability device

Change-Id: I55dbed3e636730c4768c61168f22c61c9e2c82fb
Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
[sizeof ()s]
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agocapability: clang-format the capability device
Ronald G. Minnich [Fri, 14 Oct 2016 20:26:16 +0000 (13:26 -0700)]
capability: clang-format the capability device

Change-Id: I3e99b8317fc57fbfb775fd4242e5fb2f36411a46
Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoadd the capability device from Harvey (from Plan 9)
Ronald G. Minnich [Fri, 14 Oct 2016 20:26:15 +0000 (13:26 -0700)]
add the capability device from Harvey (from Plan 9)

Change-Id: If159e72517809eedd0d1e98271e3dde57e035090
Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agocrypto: get sha256 support to build.
Ronald G. Minnich [Thu, 13 Oct 2016 20:39:04 +0000 (13:39 -0700)]
crypto: get sha256 support to build.

For now we'll just go with the sha256.c. That said,
we'll keep the other bits in here. Sooner or later we may
need the other crypto functions. Note these are not compiled
in conditionally.

We should consider removing the conditional compiling
of the unrolled code; we don't have space constraints of firmware.

Change-Id: Ic792cf2b89fa4f01a94c420eb3c620b62c7bf2a9
Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agocrypto: move includes to kern/include
Ronald G. Minnich [Thu, 13 Oct 2016 20:39:03 +0000 (13:39 -0700)]
crypto: move includes to kern/include

Change-Id: Id9e62496bb6595a7f282dfa26bd1fa1cbdac8bb4
Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agocrypto: initial import of the chromeos vboot libraries
Ronald G. Minnich [Thu, 13 Oct 2016 20:39:02 +0000 (13:39 -0700)]
crypto: initial import of the chromeos vboot libraries

This code is needed to support the capability device, imported
in a separate commit. This is recommended as a 'best' version
of these algorithms by a security expert at Google.

This is from https://chromium.googlesource.com/chromiumos/platform/vboot_reference
ref 3b55afa94e84c91874fcdad352b4053036886aa7

Change-Id: Ie3d90f183df990fd5bde6dfd83efbbd1e9b6009b
Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>
3 years agoifconfig: invoke ipconfig with `-P` when configuring loopback.
Dan Cross [Fri, 7 Oct 2016 20:18:30 +0000 (16:18 -0400)]
ifconfig: invoke ipconfig with `-P` when configuring loopback.

Don't overwrite cached data from the DHCP server.

Change-Id: Ie7d3ad4be5d9cf6aeb4def7d8c47ffefe522c80d
Signed-off-by: Dan Cross <crossd@gmail.com>
Signed-off-by: Barret Rhoden <brho@cs.berkeley.edu>