Virtualization and its levels

Virtualization

In computing, virtualization refers to the act of creating a virtual (rather than actual) version of something; this includes virtual computer hardware, virtual storage devices and virtual network resources.

A simple example of virtualization is creating partitions on your hard disk. You are not physically breaking the hard disk into pieces; you are virtually creating that many partitions of it.

This video explains all about the levels of virtualization.

Levels of Virtualization


Virtualization at Instruction Set Architecture (ISA) level

  • Every machine has an instruction set.
  • This instruction set is an interface between software and hardware.
  • Using these instructions, software can communicate with the hardware.
  • When virtualization is carried out at this level, we create an emulator which receives all the instructions from the virtual machines. For example, if a virtual machine wants to access the printer, that instruction is passed to the emulator.
  • The emulator interprets what type of instruction it is, maps it to an equivalent instruction of the host machine, has the host carry it out, and returns the result to the virtual machine (a minimal sketch of this interpret-and-dispatch loop follows this list).
  • This technique is simple to implement, but because every instruction has to be interpreted before it is mapped, too much time is consumed and performance is poor.
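
To make the interpret-and-map idea concrete, here is a minimal Python sketch of such an emulator loop. The guest instruction names, the mapping table and the host_execute helper are all invented for illustration; a real ISA-level emulator decodes binary opcodes and is far more involved.

```python
# Hypothetical sketch of an ISA-level emulator's interpret-and-dispatch loop.
# Instruction names, the mapping table and host_execute are illustrative only.

def host_execute(operation, operands):
    """Stand-in for running the mapped instruction on the host machine."""
    print(f"host executes {operation} with {operands}")
    return 0  # pretend result

# Mapping from guest instructions to host operations (normally built per ISA).
GUEST_TO_HOST = {
    "GUEST_ADD": "HOST_ADD",
    "GUEST_LOAD": "HOST_LOAD",
    "GUEST_PRINT": "HOST_IO_WRITE",   # e.g. the printer access from the text
}

def emulate(guest_instructions):
    results = []
    for opcode, operands in guest_instructions:
        # 1. Interpret: figure out what kind of instruction this is.
        host_op = GUEST_TO_HOST.get(opcode)
        if host_op is None:
            raise ValueError(f"unknown guest instruction: {opcode}")
        # 2. Map and execute on the host, 3. return the result to the VM.
        results.append(host_execute(host_op, operands))
    return results

emulate([("GUEST_ADD", (1, 2)), ("GUEST_PRINT", ("hello",))])
```

The per-instruction lookup in the loop is exactly where the performance cost described above comes from.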

Virtualization at Hardware Abstraction Layer (HAL) level

  • Virtualization at the ISA level loses performance because every instruction is interpreted; virtualization at the HAL level overcomes that.
  • In this type we map the virtual resources to the physical resources.
  • We don't interpret every instruction; we only check whether it is a privileged instruction or not.
  • If the instruction is not privileged, we simply allow normal execution, because the virtual and physical resources are already mapped, so access is direct (see the sketch after this list).
  • If the instruction is privileged, we pass control to the VMM (Virtual Machine Monitor), which deals with it accordingly.
  • Many virtual machines may run simultaneously on the same host system, so if privileged operations like memory management or scheduling are not handled properly, the system can crash.
  • Even after many advancements there are still certain exceptions that cannot be caught by this method, which is a drawback of this type of virtualization.
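
The privileged-versus-unprivileged check can be pictured with the small Python sketch below. The instruction names and the VMM class are assumptions made only for illustration; real hypervisors perform this trap-and-emulate step in hardware or via binary translation, not in Python.

```python
# Conceptual sketch of trap-and-emulate at the HAL level.
# The instruction names and the VMM handler are illustrative assumptions.

PRIVILEGED = {"SET_PAGE_TABLE", "HALT", "IO_OUT"}   # e.g. memory management, I/O

class VMM:
    """Stand-in Virtual Machine Monitor that emulates privileged instructions safely."""
    def handle(self, vm_id, instr):
        print(f"VMM emulates {instr} on behalf of VM {vm_id}")

def run_instruction(vm_id, instr, vmm):
    if instr in PRIVILEGED:
        # Privileged: trap to the VMM so one VM cannot corrupt the whole host.
        vmm.handle(vm_id, instr)
    else:
        # Unprivileged: runs directly, since virtual resources are already
        # mapped to physical ones.
        print(f"VM {vm_id} executes {instr} directly on hardware")

vmm = VMM()
run_instruction(1, "ADD", vmm)
run_instruction(1, "SET_PAGE_TABLE", vmm)
```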

Virtualization at Operating System (O.S.) level

  • In virtualization at the HAL level, each virtual machine is built from scratch, i.e. by installing the O.S., application suites, networking software, etc.
  • In the cloud we sometimes need to initialize 100 virtual machines at a time; with virtualization at the Hardware Abstraction Layer (HAL) level this can take too much time.
  • To overcome this, in virtualization at the operating system level we share the operating system between virtual machines along with the hardware.
  • We keep the base O.S. the same and install only the differences in each individual virtual machine.
  • For example, if we want to install different versions of Windows on virtual machines (VMs), we keep the base Windows O.S. the same and install only the differences in each VM.
  • A drawback of this type is that you can install only those O.S.s in the VMs that belong to the same parent O.S. family; for example, you can't run Ubuntu on a VM whose base O.S. is Windows.

Virtualization at Library Level or Programming language level

  • When developers develop applications, they save the user from the coding details by providing an Application Programming Interface (API).
  • This has given a new opportunity for virtualization.
  • In this type, we use library interfaces to provide a different Virtual Environment (VE) for the application.
  • In short, we provide the user with an emulation layer with which the user can run applications built for a different O.S.
  • An example of this is the WINE tool, which was used mostly by Mac users to play the game Counter-Strike 1.6, which at the start was only available for Windows.

Virtualization at Application Layer level

  • In this kind of virtualization, virtual machines run as an application on the host operating system.
  • We create a virtualization layer that sits above the host operating system and encapsulates the applications from the underlying O.S.
  • When applications are loaded, the host O.S. provides them with a runtime environment; the virtualization layer replaces part of this runtime environment and gives the virtualized applications a virtual environment.

OpenStack Cloud Architecture

  • OpenStack is a free and open source software platform for cloud computing. It is mostly deployed as Infrastructure as a Service (IaaS), where virtual servers and other resources are made available to customers.
  • OpenStack has a modular architecture: there are different modules or open source projects, possibly from different vendors, and all these projects are connected to give us this infrastructure.

This video explains all about the OpenStack cloud architecture.

Conceptual Architecture


The conceptual architecture shows nine different components or projects and how they conceptually interact with each other.

Let us first understand what these components provide us.

Code name    | Services provided
Nova         | Compute
Cinder       | Block Storage
Swift        | Object Storage
Glance       | Image
Neutron      | Networking
Keystone     | Identity management
Horizon      | Dashboard
Ceilometer   | Metering and Monitoring (Telemetry)
Heat         | Orchestration

Nova

  • It provides compute services, i.e. it provides virtual servers on demand.
  • It automates and manages pools of compute resources.

Cinder

  • It provides Block Storage as a service for OpenStack.
  • It is designed to present storage resources to end users; these storage resources are then used by Nova.
  • In short, Cinder virtualizes the management of block storage devices and provides end users with a self-service API to request and consume those resources.

Swift

  • It provides Object Storage, i.e. the data is stored in the form of objects.
  • Unlike in traditional filesystems, if you want to modify an object here you have to pull it out entirely, make the modifications and then push it back in.
  • You may feel that this is tedious, but this type of storage works well for data that doesn't require much modification. For example, we can store images or videos, which don't require much modification, and just by fetching the objects you can load them.
  • Swift also provides replication and scalability, which aren't provided by Cinder.
  • Replication means the data is stored at different places so it can be recovered easily after a system crash, and scalability means you can scale up (increase) or scale down (decrease) your storage as per your need.

Glance

  • It provides Image Services for OpenStack.
  • Virtual machine images (for example ISO or disk images) and their metadata are stored here, and they can be discovered, registered and retrieved by users, i.e. you can find an image and use it to install that O.S. on your virtual machine.
  • If you want to take backups of the data stored on your server, you can create server images, i.e. copy all the data the server contains and store it at multiple locations.

Neutron

  • Neutron provides networking services.
  • It is a system to manage networks and IP addresses. It provides scalability and Neutron’s services can be used through an API.
  • Users can use this API to create networks for their different user groups or different applications.

Keystone

  • Keystone is a central component for authentication and authorization.
  • Before you use any of the other OpenStack projects or services, Keystone authenticates you and authorizes you, i.e. checks whether you are allowed to use that service.
  • Authentication is done using username and password credentials, token-based systems, etc.
  • It also provides a catalog which shows a list of all the services deployed on the cloud.

Horizon

  • It provides a dashboard using which the user can access other services easily.
  • With this dashboard you can perform most of the operations like launching a VM, assigning IP addresses and setting access controls.

Ceilometer

  • Often known as Telemetry, it provides metering and monitoring services.
  • It provides us with data about how many physical and virtual resources are being used on the cloud.
  • Based on this data, cloud providers can charge their users, and we can also define triggers (actions to be taken when the data shows a dangerous or critical condition).

Heat

  • It provides orchestration Service.
  • You create a template of your infrastructure and load it into Heat, and based on that template Heat generates your infrastructure.
  • If you want to update your cloud by adding or removing some services, you make the changes in the template, load it into Heat, and your new infrastructure is generated.
  • Heat also provides auto-scaling features: for example, if the data from Ceilometer shows that CPU utilization has been above 70% for more than 5 minutes, we can define a trigger that automatically adds more front-end servers (a rough sketch of such a trigger follows this list).
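
As a rough illustration of that last bullet, here is a small Python sketch of the "CPU above 70% for more than 5 minutes, then scale out" rule. The two helper functions are placeholders, not real Ceilometer or Heat API calls; in a real deployment this condition would be expressed as an alarm in the orchestration template rather than hand-written code.

```python
# Illustrative sketch of an auto-scaling trigger. The helper functions are
# placeholders, not real OpenStack API calls.
import random
import time

THRESHOLD = 70.0      # percent CPU utilization
WINDOW = 5 * 60       # 5 minutes, in seconds

def get_cpu_utilization():
    """Placeholder for a telemetry query; returns a fake reading for the sketch."""
    return random.uniform(0, 100)

def add_front_end_server():
    """Placeholder for a Heat scale-out action."""
    print("scaling out: adding one front-end server")

def watch(poll_interval=30):
    breach_started = None
    while True:
        if get_cpu_utilization() > THRESHOLD:
            breach_started = breach_started or time.time()
            if time.time() - breach_started >= WINDOW:
                add_front_end_server()
                breach_started = None      # reset after scaling out
        else:
            breach_started = None          # utilization dropped, reset the timer
        time.sleep(poll_interval)

# watch()  # would run the monitoring loop forever
```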

Now take a look back at the conceptual architecture: you will see all the projects connected to each other, and now that you know what each of them does, you will find it easy to understand.
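
For a feel of how these services are consumed programmatically, here is a hedged sketch using the openstacksdk Python library. It assumes the library is installed and that a cloud entry named "mycloud" exists in your clouds.yaml; the cloud name is a placeholder, and the listing will of course only work against a real deployment.

```python
# Sketch: listing resources from a few OpenStack services with openstacksdk.
# Assumes `pip install openstacksdk` and a cloud named "mycloud" in clouds.yaml.
import openstack

conn = openstack.connect(cloud="mycloud")   # Keystone handles authentication behind the scenes

print("Servers (Nova):")
for server in conn.compute.servers():
    print(" ", server.name, server.status)

print("Images (Glance):")
for image in conn.image.images():
    print(" ", image.name)

print("Networks (Neutron):")
for network in conn.network.networks():
    print(" ", network.name)
```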

Logical Architecture


  • No need to be afraid when looking at the figure; these are all the projects we just discussed, only in more depth.
  • I am going to explain this only superficially, as explaining everything in depth would take hours.
  • First of all we have the internet, using which the user can access Horizon, the dashboard.
  • Horizon provides a GUI for all the other services.
  • For communication between the various projects, or between the user and a project, each project provides one or more HTTP/RESTful interfaces.
  • REST stands for Representational State Transfer, and it is a way of providing interoperability between computer systems on the internet.
  • REST is used over SOAP because REST uses less bandwidth, which makes it more suitable for internet usage (a hedged sketch of such a REST call to Keystone follows this list).
  • For communication between different components of the same project, a message queue is used.
  • At the bottom of the logical architecture we have Keystone, the authentication and authorization centre, which authenticates (checks whether the user is a valid user) and authorizes (checks whether the user is allowed to access that specific service).
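
To show what one of these HTTP/RESTful interfaces looks like in practice, here is a sketch of requesting a token from Keystone's v3 API with the requests library. The endpoint URL, user name, project name and password are placeholders; check your own deployment's service catalog for the real values.

```python
# Sketch: authenticating against Keystone's v3 REST API with `requests`.
# URL, credentials and project name are placeholders for illustration.
import requests

KEYSTONE_URL = "http://controller:5000/v3/auth/tokens"   # placeholder endpoint

payload = {
    "auth": {
        "identity": {
            "methods": ["password"],
            "password": {
                "user": {
                    "name": "demo",
                    "domain": {"id": "default"},
                    "password": "secret",
                }
            },
        },
        "scope": {"project": {"name": "demo", "domain": {"id": "default"}}},
    }
}

resp = requests.post(KEYSTONE_URL, json=payload)
resp.raise_for_status()
token = resp.headers["X-Subject-Token"]   # Keystone returns the token in this header

# The token is then sent to other services (Nova, Glance, ...) as X-Auth-Token.
print("Got token:", token[:16], "...")
```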

 

 

CLOUD COMPUTING SERVICES

We all know that cloud technology is growing at a rapid pace, and nowadays small organizations don't want to buy and manage their own servers. So they are switching to clouds, where the cloud owner provides the user or organization with certain services.

These can be Database as a Service, Security as a Service, Identity Management as a Service, etc. But in this article we will discuss the basic and most important ones, which make up the SPI model: Software, Platform and Infrastructure as a Service.

If you learn well by watching videos, here is a short video to help you understand this topic better!

 

Let us take a simple example to understand cloud computing services better.

If we want to plan a wedding, we will require a wedding hall or a wedding ground. We will need to decorate it, and we will also need good food.

Suppose we are provided with 3 different packages.

First package– only the wedding hall


If we choose this package, we will have to do decoration and catering ourselves.

Second package– wedding hall + decoration


If we choose this package, we will only need to take care of the catering service.

Third package– wedding hall + decoration + caterers


If we choose this package,  all we need to do is sign a big fat cheque.

Now if we relate this to cloud services,

IaaS (Infrastructure as a Service)– only the hardware is provided

PaaS (Platform as a Service)– hardware + operating system(s) are provided

SaaS (Software as a Service)– hardware + operating system(s) + applications are provided


These three offerings form a hierarchy of cloud services.

If you choose IaaS, you will need system administrators to decide which operating system to use based on the infrastructure and your company's needs, and you will also need developers to write applications to run on those operating systems. So this is chosen by big organizations that can afford them.

If you choose PaaS, you are provided with the infrastructure and the operating system, so you will not be able to install your own operating system. All you need is developers to develop applications that can be deployed on that O.S. So this is generally chosen by developers.

If you choose SaaS, you get infrastructure + O.S. + applications, so you just need to customize the application in the initial phase and then you are good to go. This is generally chosen by end users.

 

Difference between Internal fragmentation and External fragmentation

  • Internal fragmentation occurs when a process is allocated more memory than it needs, leaving a small unused gap inside its partition. External fragmentation occurs when processes finish execution, are swapped out of memory and are replaced by smaller processes, leaving many small non-contiguous (non-adjacent) blocks of unused space that together could serve a new request but cannot, because they are not adjacent.
  • Internal fragmentation arises when memory is divided into fixed-size partitions. External fragmentation arises when memory is divided into variable-size partitions based on the size of each process.
  • Internal fragmentation can be reduced by allocating memory dynamically or by having partitions of different sizes. External fragmentation can be reduced by compaction, paging or segmentation.

Here is a short video I made to help you understand better!

Internal Fragmentation


  • It arises when we use fixed-size partitioning.
  • Some part of the memory is kept for the operating system and the rest is available as user space.
  • In this case the user space is divided into blocks of 10 KB each.
  • When process 1, with size 8 KB, is allocated a 10 KB block, 2 KB is left unused. When process 2, with size 10 KB, is allocated a 10 KB block, no space is left unused. When process 3, with size 9 KB, is allocated a 10 KB block, 1 KB is left unused.
  • Here two processes are allocated more space than they need, and this unused space is too small to hold a new process, so it is wasted. This is called INTERNAL FRAGMENTATION. (A small worked tally follows this list.)
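
The waste in the example above can be tallied in a few lines of Python; the block size and process sizes are the ones used in the example.

```python
# Internal fragmentation with fixed-size partitions (10 KB blocks),
# using the process sizes from the example above.
BLOCK_SIZE = 10  # KB

process_sizes = {"process 1": 8, "process 2": 10, "process 3": 9}  # KB

total_wasted = 0
for name, size in process_sizes.items():
    wasted = BLOCK_SIZE - size          # unused space left inside the block
    total_wasted += wasted
    print(f"{name}: needs {size} KB, gets {BLOCK_SIZE} KB, wastes {wasted} KB")

print(f"total internal fragmentation: {total_wasted} KB")   # 2 + 0 + 1 = 3 KB
```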

External Fragmentation


  • It arises when the dynamic partitioning technique is used.
  • Here memory is allocated to the processes dynamically, based on their size.
  • In this example, the user space contains processes 1, 2 and 3, out of which processes 1 and 3 complete their execution and are swapped out, and two other processes, processes 4 and 5, are swapped in in their places.
  • Process 4 takes the place of process 1, but as its size is only 8 KB, it is allocated only 8 KB and the rest is left unused.
  • Process 5 takes the place of process 3. It is allocated 6 KB of space and 8 KB is left unused.
  • Now suppose a new process, process 6, of size 6 KB wants to be swapped in. Even though there is enough unused space in total, we cannot serve this request because the free blocks are not contiguous (adjacent). This is called EXTERNAL FRAGMENTATION. (A small illustration follows this list.)
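
The idea can be checked with a short Python snippet. The hole sizes below are assumptions chosen only to illustrate the point: enough free memory in total, but no single contiguous hole large enough for the new process.

```python
# External fragmentation: enough free memory in total, but no single
# contiguous hole is big enough. Hole sizes here are illustrative assumptions.
free_holes = [2, 4]          # KB, non-contiguous gaps left after swapping
new_process = 6              # KB, process 6 from the example

total_free = sum(free_holes)
largest_hole = max(free_holes)

print(f"total free: {total_free} KB, largest contiguous hole: {largest_hole} KB")
if largest_hole >= new_process:
    print("request can be served")
else:
    print("request cannot be served even though total free space is enough")
```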

SYSTEM CALLS AND ITS TYPES IN OPERATING SYSTEM

SYSTEM CALLS


  • System calls provide an interface between user programs and the operating system.
  • It is a programmatic way in which a computer program requests a service from the kernel of the operating system.

Here is a short video I made which will help you understand better.

Let us first understand the 2 modes in which a program executes.


  • User mode
  • Kernel mode

When a program is executing in user mode, it is not in privileged mode. So whenever it needs a hardware resource like RAM or a printer, it has to make a request to the kernel, and this request is known as a SYSTEM CALL.

When a program is executing in kernel mode, it is executing in privileged mode, so it can access any hardware resource. When a program running in user mode needs to access a resource, it makes a system call to the kernel; a context switch then occurs, which takes the program from user mode to kernel mode. After the resource is accessed, another context switch occurs, which takes the program's execution back to user mode.

Now you may wonder why all programs don't simply run in kernel mode so that we can skip the context switching. This is because if a program crashes in kernel mode, the entire system is halted. So most programs are executed in user mode, because if a program crashes there, the entire system isn't affected.

Now let us take an example.


 

If we want to write a program to copy the contents of one file into another, then first of all this program will need the names of those files. The user gives these names either by typing them in the console or by selecting them using a GUI. So our program needs to make system calls to the kernel to access the input and output devices.

Our program will also need to display a message when the copy completes successfully, or when it stops and is aborted. All these tasks require system calls. (A rough sketch of such a copy program is given below.)
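
On a Unix-like system, Python's os module exposes thin wrappers over these system calls, so the copy program can be sketched roughly as below. The file names are placeholders; a real program would take them from the user, as described above.

```python
# Rough sketch of the copy program using the os module, whose functions map
# almost one-to-one onto the open/read/write/close system calls.
import os
import sys

src_name, dst_name = "input.txt", "output.txt"   # placeholders; normally given by the user

try:
    src = os.open(src_name, os.O_RDONLY)                      # open() system call
    dst = os.open(dst_name, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    while True:
        chunk = os.read(src, 4096)                            # read() system call
        if not chunk:
            break
        os.write(dst, chunk)                                  # write() system call
    os.close(src)                                             # close() system call
    os.close(dst)
    os.write(1, b"copy completed successfully\n")             # message to stdout
except OSError as err:
    os.write(2, f"copy aborted: {err}\n".encode())            # message to stderr
    sys.exit(1)
```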

TYPES OF SYSTEM CALLS

A) Process Control

Processes need to be controlled: a running process must be able to halt its execution either normally or abnormally, and one process may need to run another process to complete its own execution. All such system calls come under this category (a minimal fork/exec/wait sketch follows the list).

  • end, abort
  • load, execute
  • create process, terminate process
  • get process attributes, set process attributes
  • wait for time, wait event, signal event
  • allocate and free memory
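
On Unix-like systems several of these process-control calls are available almost directly from Python's os module; here is a minimal sketch (the command being run is just an example).

```python
# Minimal sketch of process-control system calls via Python's os module
# (Unix only): fork creates a process, execvp loads a program, waitpid blocks
# until the child terminates.
import os

pid = os.fork()                      # "create process"
if pid == 0:
    # Child: replace this process image with another program ("load, execute").
    os.execvp("ls", ["ls", "-l"])    # example command; never returns on success
else:
    # Parent: wait for the child to finish ("wait event").
    _, status = os.waitpid(pid, 0)
    print(f"child {pid} exited with status {os.WEXITSTATUS(status)}")
```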

B) File Management

System calls which deal with operations related to files fall under this type.

  • create file, delete file
  • open, close
  • read, write, reposition
  • get file attributes, set file attributes

C) Device Management

A process may need several resources for its execution. So system calls used for asking permission from the kernel to use those resources are included in this type.

  • request device, release device
  • read, write, reposition
  • get device attributes, set device attributes
  • logically attach or detach devices

D) Information Maintenance

We need to keep all this information up to date, and these system calls help us do that.

  • get time or date, set time or date
  • get system data, set system data
  • get process, file, or device attributes
  • set process, file, or device attributes

E) Communication

Processes need to communicate with each other for many reasons, for example when one of them needs a resource held by another process. These system calls assist in doing so.

  • create, delete communication connection
  • send, receive messages
  • transfer status information
  • attach or detach remote devices

 

What is RAID? Different levels of RAID(0-6,10)

RAID- Redundant Array of Inexpensive/Independent Disks

  • RAID is a technique in which we use multiple physical hard disks that together act as a single logical hard disk.


  • Initially, RAID was known as a Redundant Array of Inexpensive Disks, because large hard disks were costly and we used multiple smaller disks instead.
  • Nowadays we use RAID to increase performance and reliability, so RAID now stands for Redundant Array of Independent Disks.
  • Based on how these multiple hard drives are used, we get the different RAID levels.

Here is a short video which I made to help you understand better!

RAID Levels

RAID 0


  • In RAID 0 all the data is striped and divided equally among the available disks.
  • Striping can be bit-wise, byte-wise or block-wise.
  • Redundancy is not present, as the data is not copied anywhere else.
  • Performance is better, as more than one disk participates in each read/write operation.
  • Consider this example: if one man is asked to write A-Z, the time taken by him will be more than that of 2 men writing A-Z, because of the 2 men one will write A-M and the other will write N-Z at the same time, which speeds up the process.
  • Hence, the more disks involved in a read/write operation, the better the performance (a toy striping sketch follows this list).
  • As there is no redundancy, i.e. no copy of the data is maintained, reliability is low; if any one of the disks fails, we lose the data.
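
Block-level striping can be pictured with this small Python sketch, which deals out fixed-size chunks of data round-robin across a set of simulated disks. The disk count and chunk size are arbitrary choices for the example.

```python
# Toy illustration of RAID 0 block striping: chunks are dealt out round-robin
# across the disks. Disk count and chunk size are arbitrary for the example.
def stripe(data: bytes, num_disks: int = 2, chunk: int = 4):
    disks = [bytearray() for _ in range(num_disks)]
    for i in range(0, len(data), chunk):
        disks[(i // chunk) % num_disks] += data[i:i + chunk]
    return disks

disks = stripe(b"ABCDEFGHIJKLMNOPQRSTUVWXYZ")
for n, d in enumerate(disks):
    print(f"disk {n}: {d.decode()}")
# disk 0: ABCDIJKLQRSTYZ
# disk 1: EFGHMNOPUVWX
```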

RAID 1


  • In RAID 1 we perform mirroring.
  • We mirror i.e make copies of all the available data.
  • This is expensive, as more disks are required to make the copies.
  • It is reliable, because if one disk is lost we already have another copy of its data.
  • It doesn't help in increasing the performance.

RAID 2


  • RAID 2 uses an error-correcting code (ECC), such as Hamming code, to restore damaged data.
  • More than 2 bits are needed to store the ECC, so a minimum of 3 dedicated disks are required, with 1 ECC bit on each disk.
  • Data is split bit-wise across the data disks (say disks 0 to 3), while the ECC is stored on the dedicated ECC disks (disks 4 to 6).
  • This is expensive, as more disks are required.
  • It slows down write operations, because for every write the ECC has to be calculated, which is time consuming.

RAID 3


  • In RAID 2 we used a minimum of 3 dedicated disks to store the ECC.
  • In RAID 3 we use the concept of parity instead of ECC.
  • Here the parity is the XOR of the corresponding stripes (say A1, A2 and A3): if they contain an even number of 1s the parity bit is 0, and if they contain an odd number of 1s the parity bit is 1 (a short worked example follows this list).
  • Only 1 parity bit per stripe is used, so this requires only 1 dedicated disk.
  • RAID 3 increases performance, as all the disks participate in every read/write operation.
  • It doesn't increase the number of simultaneous accesses, because all the disks take part in each read/write operation, so no disk is free to serve another read/write request.

RAID 4


  • RAID 4 is similar to RAID 3. The only difference is that in RAID 3 we split the data bit-wise, while in RAID 4 we split it block-wise.
  • As blocks are bigger than bits, smaller read/write operations involve only one disk, so the other disks are free to serve other read/write requests; hence RAID 4 increases the number of simultaneous accesses.
  • As only one disk is involved in such an operation, performance is not increased.
  • It is mainly suitable for larger read/write operations, in which many blocks of data have to be read or written, so more than one disk is involved, which boosts performance.

RAID 5


  • In RAID 3 and RAID 4 one dedicated disk was used to store the parity information. If that disk fails, we lose all of the parity.
  • To overcome this flaw, in RAID 5 we distribute the parity evenly among all the disks.
  • Any formula can be used to distribute it evenly (for example, with 5 disks, the parity of the nth block can be stored on disk (n mod 5) + 1); a quick tabulation of this formula follows the list.
  • The parity for each block is stored on the disk chosen by the formula, and the actual data is divided among the other disks.
  • This reduces the potential overuse of a single disk.
  • As only one parity block is stored per stripe, we can recover only from the failure of a single disk; if more than 1 disk fails, RAID 5 cannot recover the damaged data.
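
Tabulating the example formula shows how the parity rotates so that no single disk holds all of it.

```python
# Tabulating the example formula: with 5 disks, the parity of stripe n
# is placed on disk (n mod 5) + 1, so the parity rotates across the disks.
NUM_DISKS = 5
for n in range(10):                       # first ten stripes
    parity_disk = (n % NUM_DISKS) + 1     # disks numbered 1..5
    print(f"stripe {n}: parity on disk {parity_disk}")
```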

RAID 6


  • RAID 6 is similar to RAID 5.
  • In RAID 5 we used only one parity block per stripe, but in RAID 6 we use two.
  • This extra parity contains additional redundant information, which allows recovery even when two disks fail at the same time.

RAID 10

  • This is a hybrid of RAID 0 and RAID 1.
  • So it gives better performance as well as reliability.
  • RAID 0 – Striping of data
  • RAID 1- Mirroring of data

RAID 0+1


  • Here the data is first striped across one set of disks (RAID 0).
  • That entire striped set is then mirrored onto a second set of disks (RAID 1), i.e. a mirror of stripes.

RAID 1+0


  • Here the data is first mirrored onto pairs of disks (RAID 1).
  • The data is then striped across those mirrored pairs (RAID 0), i.e. a stripe of mirrors.

What is ARP and ARP Spoofing (ARP cache poisoning)?

Here is a short video I made to help you understand better!

ARP

ARP-Address Resolution Protocol


  • The ARP protocol operates at the second layer, i.e. the Data Link Layer.
  • ARP is used to find the MAC address of a machine in a LAN whose IP address is known.
  • Since IP addresses are of little use at the Data Link Layer, MAC addresses are used for communication within a LAN.

Steps involved:

  • ARP Request
  • ARP Response(ARP Reply)

ARP Table: Every machine in a LAN maintains an ARP Table, which consists of two columns: IP address and MAC address. For faster communication, MAC addresses are stored against their corresponding IP addresses so that an ARP Request does not have to be sent every time.

Explanation with example:


ARP Request

  • Suppose machine A wants to communicate with machine B. Machine A checks its ARP Table; if it doesn't have machine B's MAC address, A sends an ARP Request.
  • As B's MAC address is not known, this is a broadcast message, and because it is a broadcast the Target MAC address field contains all 0s.

ARP Response

  • All the machines receive the ARP Request sent by A and compare their own IP address with the Target IP.
  • The machine whose IP address matches the Target IP sends an ARP Response to A.
  • In this case B's IP matches the Target IP, so B sends an ARP Response to A containing its MAC address; A updates its ARP Table and communication is established. (A small scapy sketch of this exchange follows.)
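
For the curious, the request/reply exchange can be observed with the scapy library. This sketch assumes scapy is installed and the script is run with sufficient privileges; the target IP is a placeholder for a host on your own LAN.

```python
# Sketch: sending an ARP Request with scapy and printing the reply.
# Requires scapy and root/administrator privileges; 192.168.1.10 is a placeholder.
from scapy.all import ARP, Ether, srp

request = Ether(dst="ff:ff:ff:ff:ff:ff") / ARP(pdst="192.168.1.10")   # broadcast who-has
answered, _ = srp(request, timeout=2, verbose=False)

for _, reply in answered:
    print(f"{reply.psrc} is at {reply.hwsrc}")   # the ARP Reply carries the MAC address
```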

ARP Spoofing (ARP cache poisoning)

  • The advantage of ARP is that it is simple to implement.
  • The disadvantage of ARP is that it involves no authentication, i.e. it doesn't check whether an ARP Response really comes from a valid source.
  • This flaw is exploited by a technique called ARP spoofing or ARP cache poisoning.
  • In ARP spoofing we poison the ARP cache, or ARP Table, by making a machine add wrong entries to its ARP Table.

Explanation with example:


This is a Man-in-the-middle attack using ARP spoofing.

  • Let's say Bob wants to communicate with Alice but doesn't have Alice's MAC address, so Bob sends an ARP Request, which is a broadcast message.
  • Along with Alice, the attacker also receives this message, because the attacker is present in the same LAN. The attacker sends a forged ARP Reply to Bob containing Alice's IP address but the attacker's own MAC address.
  • Bob updates its ARP Table, mapping Alice's IP address to the attacker's MAC address (since there is no authentication), and from then on Bob sends all his packets to the attacker instead of Alice.
  • The same process is repeated with Alice, who is made to map Bob's IP address to the attacker's MAC address; thus a Man-in-the-Middle attack is carried out using ARP spoofing. The toy model below shows why the cache accepts the forged reply.
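
The reason the attack works, namely that replies are accepted without authentication, can be boiled down to a toy Python model of an ARP cache. All IP and MAC addresses here are made up for illustration.

```python
# Toy model of an ARP cache with no authentication: whoever answers last wins.
# All IP and MAC addresses below are made up for illustration.
arp_table = {}   # Bob's cache: IP address -> MAC address

def on_arp_reply(sender_ip, sender_mac):
    """An ARP reply is accepted without checking who really sent it."""
    arp_table[sender_ip] = sender_mac

# Legitimate reply from Alice:
on_arp_reply("10.0.0.2", "aa:aa:aa:aa:aa:aa")
# Forged reply from the attacker, claiming Alice's IP with the attacker's MAC:
on_arp_reply("10.0.0.2", "ee:ee:ee:ee:ee:ee")

print(arp_table)   # Bob now sends Alice's traffic to the attacker's MAC address
```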