{\rtf1\ansi\ansicpg1252\deff0{\fonttbl{\f0\fnil\fcharset0 Courier New;}{\f1\fswiss\fprq2\fcharset0 @Arial Unicode MS;}} {\*\generator Msftedit 5.41.21.2508;}\viewkind4\uc1\pard\lang3081\b\f0\fs28 IPC Design \b0\fs20\par
\par
Write-up\par
\par
\b Overriding Goals\b0\par
- Fast\par
- Easy to maintain / flexible\par
- Reliable\par
- Secure\par
- Multi-platform\par
\par
\par
\par
\b Key design points\b0\par
Strongly typed messages inherited from a base class\par
Point-to-point queues (i.e. no routing or addressing)\par
Priority in queues\par
A message may activate a sleeping destination (if the destination desires)\par
All messaging is asynchronous\par
User apps, kernel and services are treated identically\par
64-bit support as a first-class citizen\par
First-class NUMA multiprocessor support\par
Heavy degree of isolation (e.g. internal data types, very modular)\par
A heavy-setup-cost, low-run-cost approach\par
Solid OO/modular design\par
App to app is a first-class citizen\par
\par
\par
\par
\par
\b More specifically, how does this achieve our goals? \b0\par
\par
\par
\b - Fast \b0\par
\par
We want an order-of-magnitude performance improvement over Linux for an IPC call. At this level the performance of the messaging system becomes less relevant (except for malloc, see below), and we trade it for a more secure and flexible system.\par
\par
We do however allow a service waiting for a message to be scheduled immediately.\par
\par
The asynchronous nature allows fast, low-latency traversal if services are scheduled.\par
(Synchronous callers are never scheduled; in these cases they can be services and get scheduled immediately when a reply arrives.)\par
\par
Alternatively it can provide high latency and high throughput by not scheduling immediately and then processing a large number of messages.\par
As the clients are asynchronous we don't need to reply immediately. 
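\par
\par
The two modes above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Service class, its send and run methods and the wake_now flag are hypothetical names, not part of the design, and a real implementation would go through the scheduler rather than a direct call.\par
\par

```python
from collections import deque


class Service:
    """Toy receiver; all names here are hypothetical, not part of the design."""

    def __init__(self):
        self.inbox = deque()
        self.handled = []   # messages processed, in arrival order
        self.wakeups = 0    # how many times the service was scheduled

    def send(self, message, wake_now):
        # Asynchronous send: the caller never waits for a reply.
        self.inbox.append(message)
        if wake_now:
            self.run()

    def run(self):
        # One scheduling of the service: handle everything queued so far.
        self.wakeups += 1
        while self.inbox:
            self.handled.append(self.inbox.popleft())


def demo(wake_now, count=5):
    service = Service()
    for i in range(count):
        service.send("msg%d" % i, wake_now)
    if not wake_now:
        service.run()  # one deferred wakeup handles the whole batch
    return service.wakeups, len(service.handled)


low_latency = demo(wake_now=True)       # one wakeup per message
high_throughput = demo(wake_now=False)  # one wakeup for all messages
```

\par
With wake_now set, the service is scheduled once per message (lowest latency); without it, a single wakeup handles the whole batch (highest throughput).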
\par
\par
Many operating systems have fast kernel calls but slow IPC; this is a killer for microkernels.\par
\par
\par
\b - Easy to maintain / Flexible\b0\par
\par
Having a single uniform IPC (i.e. everything uses it: app to app, app to kernel, etc.) is far easier to maintain than a separate ABI and IPC.\par
Less code is always good.\par
\par
Again, having no special rules for services and the kernel makes IPC simpler and more uniform.\par
\par
Messages are strongly typed to reduce potential errors. (If performance does matter we can use UintPtr instead, but a strongly typed system will result in fewer errors.)\par
\par
A single entry point into a service allows all messages to be validated and common code applied without having to call these in every method.\par
\par
By using messages we never need to change the code for new implementations, as the base class will always be SystemMessage.\par
\par
Asynchronous systems are by nature more flexible: messages may be queued, services may be moved to a different machine, etc. It's even possible for one app to send and another app to receive.\par
\par
It integrates naturally with networks, e.g. a P2P message app would send the messages via UDP.\par
\par
\par
\b - Reliable \b0\par
\par
Reliability is increased by using strong typing and ensuring only one writer has access to shared data.\par
If the writer fails, the data can be corrupt or in an undefined state; if a reader fails, there is no issue with the shared data.\par
\par
System reliability can also be improved by using queues for a service and ordering requests. This prevents a task from freezing out high-priority tasks by sending large amounts of work (e.g. bringing up Task Manager when your machine is paging to death). With messages, the high-priority processes can be serviced first; this also enables better real-time support. 
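\par
\par
A minimal sketch of the priority ordering described above, in Python. The PriorityMessageQueue class and the convention that a lower number means higher priority are assumptions made for this illustration only; the design does not prescribe a data structure.\par
\par

```python
import heapq
from itertools import count


class PriorityMessageQueue:
    """Toy queue: lower number means higher priority (an assumption of this sketch)."""

    def __init__(self):
        self._heap = []
        self._order = count()  # tie-breaker keeps FIFO order within a priority

    def enqueue(self, priority, message):
        heapq.heappush(self._heap, (priority, next(self._order), message))

    def dequeue(self):
        priority, _, message = heapq.heappop(self._heap)
        return message

    def __len__(self):
        return len(self._heap)


queue = PriorityMessageQueue()
for i in range(3):
    queue.enqueue(5, "bulk-%d" % i)      # a task flooding the service with work
queue.enqueue(0, "open-task-manager")    # high-priority request arrives last

served = [queue.dequeue() for _ in range(len(queue))]
```

\par
Although the high-priority request is enqueued last, it is dequeued first, and the bulk requests keep their original order behind it.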
\par
\par
\b - Secure\b0\par
\par
The queue system with capabilities guarantees who the sender is and provides a lot of security. This is assisted by treating everything as a user process, which prevents the exploitation of services commonly seen in attacks.\par
\par
Typed messages also allow for more security and less tampering.\par
\par
Treating everything as a user app encourages the Principle of Least Privilege.\par
\par
\par
\b If there is no ABI, how do applications start?\b0\par
\par
There is at least a Send Message call. When launching an application, a loader passes the information (with a user key) to the OS, which creates the process along with its queues and a key.\par
\par
\par
\par
\b Why is the GC not part of the IPC system (i.e. a separate service)?\b0\par
\par
The GC needs to be in process; ANY overhead would be extremely costly (e.g. creating lots of strings for a web site).\par
It's worth noting that the in-process GC communicates with the MM via IPC, but these are chunky calls requesting many or large pages.\par
In fact we hope that allocation in the GC is simply:\par
\par
    MOV R1, DWORD [ObjectSize]\par
    LOCK XADD [GCNurseryPtr], R1 ; LOCK is needed: XADD on memory is only atomic across CPUs with the prefix\par
    ; Note R1 now contains the old value of the pointer, and this instruction is thread safe\par
    ADD R1, DWORD [ObjectSize] ; R1 = new value of the nursery pointer\par
\par
; There may be a better way, but this is thread safe even for NUMA architectures\par
; where other CPUs may update the memory. A separate add followed by a read would not be thread safe.\par
\par
\par
\par
\par
\b Are the messages routed? \b0\par
\par
To avoid complex and weighty dispatching, all messages are sent point to point (direct to the destination).\par
To get a new destination you must request it and have the appropriate capability. By default an app has a queue to the MM system and the Security Capability system. 
These queues are located on the capability.\par
While the startup cost is higher, it means:\par
- No routing is needed.\par
- Messages do not need to carry the sender and receiver; this quickly becomes a major benefit.\par
- It's more secure: as the connection is point to point, we can identify the sender and it can't be forged.\par
- Applications don't need to talk to the kernel.\par
\par
\par
\b Why not route all messages?\b0\par
Basically, while routing would let you prioritize better, it creates a single bottleneck with locks and threading, which would be an issue.\par
In addition it's easier to create deadlock loops. There may also need to be a kernel thread processing this queue (or you have even more locking issues), which brings a whole range of issues.\par
\par
With regard to prioritization, busy services will be able to do plenty in their own queues, and since they understand the nature of their callers they are better able to serve and prioritize them.\par
\par
Anyway, the queue system allows a client to send to any point; they just have to ask first. This is more secure and just as flexible.\par
\par
\par
\b\par
How do we ensure that messages passed around do not violate the address space of apps?\b0\par
\par
The only thing passed around is a pointer to the message. The message is located in the shared address space: when a request is made to create a message, it is created on a shared heap.\par
\par
\b\par
What other options were considered for handling messages?\b0\par
\par
Option 1. Have a single ABI call (everything else is messages) that creates the message on the shared memory heap; objects referenced in the message will need to be copied. Note this does not need to be a system call.\par
Option 2. Copy the message and objects on send (and the address on the stack) (the Minix, Mach, L3/L4 approach).\par
\par
Option 3 is really the same as Options 1 and 2, just with the ABI in the GC. I don't like copying messages on send. 
\par
Even though it CAN be very efficient for small messages, it doesn't support shared memory. The main reason those OSes use it is their high IPC costs.\par
By using a shared heap we can send quite large objects by reference; this means we DON'T need to copy data from the network driver to the higher layers.\par
The NIC can place the data immediately in the shared memory heap.\par
\par
\par
\b Why 64-bit and NUMA support?\b0\par
\par
All PCs built in the future will be 64-bit and multi-core. It is trivial to add this now but a real pain later.\par
\par
\b How do the queues work?\b0\par
\par
When an application is created it has 3 sets of queues. Each set consists of a SendMessage method, which maps to a receiving queue at the receiver, as well as a queue to receive messages from the receiver.\par
By default an application has a set for each of the following:\par
- Security\par
- Memory Manager\par
- Dispatcher\par
\par
Additional queue pairs may be created to ANY process; however, a strong security check is done at the time of creation. Messages are created in shared memory and hence can map directly to large amounts of data, e.g. a 1G file.\par
\par
\par
\par
\b\par
How does the IPC actually work?\b0\par
\par
The hard part is the last question. Basically the message pointer is passed in a memory location, and it is a pointer into the shared memory heap.\par
We can't use the stack here unless we copy it, which involves an intermediary doing the copy.\par
\par
In more detail:\par
Client Thread\par
UserCode\par
Create Message via CreateMessageAPI [Cosmos.GC]\par
Message created on Shared Memory Heap [Cosmos.SharedMemory]\par
\par
\par
\tab queue.SendMessage(message, bool yield) (Ben.IPC)\par
\tab\tab ProcessMessage(message)\par
\tab\tab if (yield)\par
\tab\tab\tab Jump to dispatcher.yieldcurrentThread. 
\par
\par
\par
\par
ProcessMessage\par
\{\par
    if (receiver is blocked waiting for a message)\par
    \{\par
        Set receiver's current message to message\par
        Set receiver's current message sender to the send capability\par
        Schedule receiver\par
    \}\par
    else\par
    \{\par
        Block caller\par
        Set GC of thread to destination\par
        if (high priority)\par
            Add message to front of destination queue\par
        else\par
            Add message to back of destination queue\par
\par
        Restore GC\par
    \}\par
    return; // end Syscall\par
\}\par
\par
\par
Receiver\par
Option 1) Good for VERY busy services\par
\par
    foreach (var queue in inputQueues) // by priority\par
        if (queue.Count != 0)\par
            ProcessMessage(queue.SenderKey, queue.Dequeue())\par
\par
\par
\par
Option 2) Override add-to-queue, good for most apps\par
    OnAddToQueue(Message)\par
        AddMessageToSingleQueue(queue.SenderKey, Message)\par
\par
\par
\b How is scheduling done?\b0\par
After a message is sent, the application is automatically stopped. However, if there is quantum left it is placed at the head of the queue. It should be noted that the only normal cases for sending multiple messages are multiple calls or a broadcast; we support this by allowing a second syscall which sends multiple messages (or the same message several times).\par
\par
\b Why does the system use pointers for this?\b0\par
Not performance. The main reason is that a lot of items require stack, which is not allowed in an interrupt handler.\par
\par
\par
\b Where is the receiver queue in terms of address space?\b0\par
\par
The receiver queue is in the application's memory space (and yes, the IPC assembly does have the right to write there).\par
\par
\par
\b How to handle assembly references to new messages when building strongly typed clients? 
\b0\par
\par
Apps (especially services) should provide a separate assembly with all the messages they require.\par
\par
\b Why is thread safety required for sending when it's point to point?\b0\par
\par
The receiver doesn't really need it, and if a sender is a single-threaded app or uses one thread for messaging then it's not needed either.\par
However, there can be multiple senders and hence thread safety is an issue.\par
\par
\b Do receivers have a single queue or one queue per sender?\b0\par
\par
This is under consideration. A single queue would be faster to parse for messages. Multiple queues allow for better checking of privileged instructions and prevent some issues (e.g. DoS).\par
\par
Note you can really have a single queue at the IPC level (since it would remove the sender information). The question is what to expose to the higher API.\par
I suspect some sort of single priority-ordered queue, with messages containing the sender information after validation and checking, but this depends on the app stack, not on IPC.\par
\par
\b Stack per thread?\b0\par
\par
Obviously each thread has its own stack. Calling another app will get a new stack.\par
\par
\b Why do we need a syscall and can't do a simple call? \b0\par
\par
You will always need to do a syscall, as the destination will have a different stack and thread, and a dispatch in between would be very bad.\par
\par
\par
\b Do we need a Monitor or Mutex on the shared memory?\b0\par
\par
Neither. There can only be one writer, and the compiler will prevent writes to shared objects without getting write rights first. Partial updates are possible (but very unlikely for messages); if this is an issue we can use some sort of update mechanism.\par
\par
\par
\b Do we need to make a syscall to create the message, and hence 2 syscalls for every message?\par
\b0\par
No, I don't think so. Apps can directly access shared memory. 
If not, we would have to do something like SendMessage(Type MessageType, params parameters);\par
\par
\par
\b Web Services Sample\b0\par
\par
TODO\par
\par
\par
\b In the example you show TCP and IP as separate. Why? \b0\par
\par
In most OSes the high IPC/context-switch cost makes it necessary to merge things together; however, from a software management and security point of view it's better to have small services. E.g. you could allow people to use Sockets or TCP but not to make or read their own IP packets. While this may not be the best case, it does show that traditional boundaries are not appropriate to a managed OS.\par
\par
\par
\b Does the kernel actually exist? Where is it?\b0\par
\par
This is almost a kernel-less OS; the only things in the kernel are the Shared Memory Heap, IPC (though it's more glue between apps) and the HAL.\par
\par
\b Why use C# structures like queues?\b0\par
\par
Basically a GC is always available (kernel or client), and for readability and extension purposes these structures work best. Also, implementing most things as pointer lists/queues is really premature optimization. I think for the majority of these things the ability to easily plug in a new class, and hence a new algorithm, will outweigh the benefits of an optimized pointer algorithm. Did I mention fewer bugs with mature code...\par
\par
\b What happens if the queue needs to grow during a syscall?\b0\par
\par
This may be an issue, as the GC in use is that of the target process. We can change the GC used, or it may be better to use a structure that won't allocate.\par
\par
\par
\b Will a microkernel deliver reasonable performance? The paper \f1\fs24 "The Cost of IPC: an Architectural Analysis" \f0\fs20 seems to say no. 
\par
\b0\par
The paper, which contradicts Liedtke's earlier work, is based on the following:\par
- Heavy message copy costs.\par
\tab We don't copy messages.\par
- Heavy context switch cost, especially from mapping memory and the TLB flush.\par
\tab We don't remap memory and don't need to flush the TLB; more importantly, the cost of a context switch is low.\par
- Lots of short messages for a few pieces of data.\par
\tab Our messages are not fixed size.\par
- Memory is getting slower.\par
\tab We should be able to keep memory use down to a minimum.\par
\par
The paper does state that microkernels deliver a more secure and reliable OS, which does apply :-)\par
\par
\b\par
Why not a pure capability OS?\b0\par
Because I don't understand it well enough. We do use capabilities though; we just don't do the OS call on a capability, e.g. ProcessorCapability.Yield()\par
\par
\par
\par
\b What is a capability?\b0\par
http://www.eros-os.org/essays/capintro.html\par
http://en.wikipedia.org/wiki/Capability-based_security\par
\par
\b\par
Why strongly typed messages?\b0\par
\par
In no particular order.\par
\par
///\par
/// 1. Better checking of security and policy. This can be placed at the entry point of a system rather than in every method. It can also be fine grained, e.g. service xyz can't talk to service abc. Many exploits have come from lesser services (e.g. the SQL Slammer worm via the SQL service), media and web services, which then compromise more fundamental system services like task scheduling.\par
/// 2. ! APIs never need to change and service changes are backward compatible.\par
/// 3. ! Can invoke the destination thread waiting on a message without this code being in every method !!!\par
/// 4. Can log and debug more easily with a single point of entry. Exceptions can be thrown, but before leaving the scope of the service (or app) they can be logged and converted to a more flexible error message.\par
/// 5. ! 
Can be prioritized, allowing high-priority tasks to jump the queue instead of blocking on a lock etc. This really helps provide better real-time support. High-priority tasks ALWAYS win, unlike in NT and Linux (if subsystems use priority).\par
/// 6. Allows cross-machine kernel-to-kernel messages.\par
/// 7. Encourages fewer but chunkier calls in API design between services, instead of lots of small calls.\par
/// 8. Allows easier asynchronous calls, which provide significant performance benefits.\par
/// 9. While there is a small performance hit in a simple OS loop test, in more complicated scenarios performance is better and more efficient due to the asynchronous nature.\par
/// 10. Encourages non-dependent call structures.\par
/// 11. Internal details are hidden and can all be internal (or private).\par
/// 12. Facilitates Lego-block design.\par
/// 13. Prevents forging of information, as 1) the caller doesn't create security (the OS does) and 2) we can clone on send (which prevents other issues with direct calls).\par
/// 14. Error messages can be adjusted to suit the language and culture of the caller, while leaving the messages within the subsystem in English.\par
/// 15. Subsystems can be changed and restarted very easily.\par
\par
///\par
/// The negatives\par
/// 1. Requires an API wrapper for synchronous operations.\par
/// 2. It can be slower, especially in tiny non-real-world benchmarks.\par
/// 3. When looking at the big picture, it appears more complex.\par
\b\tab\par
\tab APIs\b0\par
\tab\par
\b\tab ABI Required \b0\par
\tab\par
\tab Send Message\par
\tab\par
\tab\par
\tab\b Other APIs needed\b0  (non-kernel calls), all within the user app address space\par
\tab\par
\tab Shared Memory\par
\tab AllocateObject\par
\tab CreateSharedRegion\par
\tab ShareRegion\par
\tab Change Owner\par
\tab\par
\tab GC\par
\tab Alloc\par
\tab DeAlloc\par
\tab\par
\tab\par
\tab Mutex\par
\tab We can use a spin lock around shared memory and hence need no kernel call. 
Or use a CPU lock around a shared memory structure.\par
\tab\par
\tab\par
\tab\b Internal kernel calls\b0  (i.e. called from a user thread but running privileged code)\par
\tab\par
\tab Schedule\par
\tab Save current thread\par
\tab Call Scheduler\par
\tab Schedule process\par
\tab Block current process\par
\tab\par
\tab\par
\tab Kernel Message API\par
\tab MM\par
\tab\par
\tab Scheduler\par
\tab Yield\par
\par
\par
\b Comparison with other OS\b0\par
\par
I believe the architecture above can deliver superior performance and be simpler, more secure and more reliable than Windows and Linux.\par
\par
The key is that in other OSes calls to another thread must be asynchronous. The only thing we are doing is making all calls asynchronous, so the MM and scheduler suffer. That being said:\par
\par
- Our MM commands are infrequent (1M page size, and the GC does the hard work).\par
- Scheduling calls are rare.\par
\par
\par
For example:\par
Windows services have their own thread, and comms must be IPC and blocking.\par
\par
Other "services" such as the MM and scheduler run in user threads, but not always. E.g. part of the file system runs in the user thread, but when it needs to call a device it blocks the user thread and creates kernel worker threads which talk to the device driver.\par
\par
\par
Windows device drivers: I'm not sure, but they probably have their own thread, as the caller must block.\par
\par
http://msdn.microsoft.com/en-us/library/ms795837.aspx\par
\par
Note it is no coincidence that high-performance systems such as web servers, SQL servers and the file system all adopt the same strategy: block the caller thread and manage separate IO threads. This design will do this by default.\par
}