UNIT – 3 Cloud Management & Virtualization Technology
Unit-03/Lecture-01
Virtualization Technologies
The dictionary includes many definitions for the word “cloud.” A cloud can be a mass of water droplets, gloom, an obscure area, or a mass of similar particles such as dust or smoke. When it comes to cloud computing, the definition that best fits the context is “a collection of objects that are grouped together.” It is this act of grouping, or creating a resource pool, that succinctly differentiates cloud computing from all other types of networked systems. Not all cloud computing applications combine their resources into pools that can be assigned on demand to users, but the vast majority of cloud-based systems do. The benefits of pooling resources to allocate them on demand are so compelling as to make the adoption of these technologies a priority. Without resource pooling, it is impossible to attain efficient utilization, provide reasonable costs to users, and proactively react to demand.
In this chapter, you learn about the technologies that abstract physical resources such as processors, memory, disk, and network capacity into virtual resources. When you use cloud computing, you are accessing pooled resources using a technique called virtualization. Virtualization assigns a logical name for a physical resource and then provides a pointer to that physical resource when a request is made. Virtualization provides a means to manage resources efficiently because the mapping of virtual resources to physical resources can be both dynamic and facile. Virtualization is dynamic in that the mapping can be assigned based on rapidly changing conditions, and it is facile because changes to a mapping assignment can be nearly instantaneous.
These are among the different types of virtualization that are characteristic of cloud computing:
• Access: A client can request access to a cloud service from any location.
• Application: A cloud has multiple application instances and directs requests to an instance based on conditions.
• CPU: Computers can be partitioned into a set of virtual machines with each machine being assigned a workload. Alternatively, systems can be virtualized through load-balancing technologies.
• Storage: Data is stored across storage devices and often replicated for redundancy.
To enable these characteristics, resources must be highly configurable and flexible. You can define the features in software and hardware that enable this flexibility as conforming to one or more of the following mobility patterns:
• P2V: Physical to Virtual
• V2V: Virtual to Virtual
• V2P: Virtual to Physical
• P2P: Physical to Physical
• D2C: Datacenter to Cloud
• C2C: Cloud to Cloud
• C2D: Cloud to Datacenter
• D2D: Datacenter to Datacenter
The techniques used to achieve these different types of virtualization are the subject of this chapter. According to Gartner (“Server Virtualization: One Path that Leads to Cloud Computing,” by Thomas J. Bittman, 10/29/2009, Research Note G00171730), virtualization is a key enabler of the first four of five key attributes of cloud computing:
• Service-based: A service-based architecture is where clients are abstracted from service providers through service interfaces.
• Scalable and elastic: Services can be altered to affect capacity and performance on demand.
• Shared services: Resources are pooled in order to create greater efficiencies.
• Metered usage: Services are billed on a usage basis.
• Internet delivery: The services provided by cloud computing are based on Internet protocols and formats.
Load Balancing and Virtualization
One characteristic of cloud computing is virtualized network access to a service. No matter where you access the service, you are directed to the available resources. The technology used to distribute service requests to resources is referred to as load balancing. Load balancing can be implemented in hardware, as is the case with F5's BIG-IP servers, or in software, such as the Apache mod_proxy_balancer extension, the Pound load balancer and reverse proxy software, and the Squid proxy and cache daemon. Load balancing is an optimization technique; it can be used to increase utilization and throughput, lower latency, reduce response time, and avoid system overload.
The following network resources can be load balanced:
• Network interfaces and services such as DNS, FTP, and HTTP
• Connections through intelligent switches
• Processing through computer system assignment
• Storage resources
• Access to application instances
Without load balancing, cloud computing would be very difficult to manage. Load balancing provides the necessary redundancy to make an intrinsically unreliable system reliable through managed redirection. It also provides fault tolerance when coupled with a failover mechanism. Load balancing is nearly always a feature of server farms, computer clusters, and high-availability applications.
A load-balancing system can use different mechanisms to assign service direction. In the simplest load-balancing mechanisms, the load balancer listens to a network port for service requests. When a request from a client or service requester arrives, the load balancer uses a scheduling algorithm to assign where the request is sent. Typical scheduling algorithms in use today are round robin and weighted round robin, fastest response time, least connections and weighted least connections, and custom assignments based on other factors.
A session ticket is created by the load balancer so that subsequent related traffic from the client that is part of that session can be properly routed to the same resource. Without this session record or persistence, a load balancer would not be able to correctly fail over a request from one resource to another. Persistence can be enforced using session data stored in a database and replicated across multiple load balancers. Other methods store a client-side cookie in the client's browser or use a rewrite engine that modifies the URL. Of all these methods, a session cookie stored on the client has the least amount of overhead for a load balancer because it allows the load balancer an independent selection of resources.
The algorithm can be based on a simple round robin system where the next system in a list of systems gets the request. Round robin DNS is a common application, where IP addresses are assigned out of a pool of available IP addresses. Google uses round robin DNS.
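The scheduling and persistence ideas above can be sketched in a few lines of Python. This is a minimal illustration rather than how any particular product works: the class name `RoundRobinBalancer`, the function `least_connections`, and the backend names are all hypothetical, and the in-memory dictionary merely stands in for the session ticket described in the text.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hypothetical balancer: (weighted) round robin plus session persistence."""

    def __init__(self, backends, weights=None):
        # Weighted round robin: a backend with weight 2 appears twice in the ring.
        weights = weights or {}
        expanded = [b for b in backends for _ in range(weights.get(b, 1))]
        self._ring = cycle(expanded)
        self._sessions = {}  # session id -> backend (stands in for the session ticket)

    def route(self, session_id):
        # New sessions take the next backend in the ring; known sessions
        # keep going to the backend they were first assigned.
        if session_id not in self._sessions:
            self._sessions[session_id] = next(self._ring)
        return self._sessions[session_id]


def least_connections(active):
    """Pick the backend that currently has the fewest active connections."""
    return min(active, key=active.get)


lb = RoundRobinBalancer(["app1", "app2", "app3"])
first = lb.route("alice")    # first backend in the ring
second = lb.route("bob")     # next backend in the ring
again = lb.route("alice")    # same backend as before (persistence)
```

Keeping the session table inside the balancer is the simplest choice for a sketch; the text notes that a client-side cookie moves this state to the client and lowers the overhead on the balancer.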
Unit-03/Lecture-02
Resiliency
Resilient computing is a form of failover that distributes redundant implementations of IT resources across physical locations. IT resources can be pre-configured so that if one becomes deficient, processing is automatically handed over to another redundant implementation. Within cloud computing, the characteristic of resiliency can refer to redundant IT resources within the same cloud (but in different physical locations) or across multiple clouds. Cloud consumers can increase both the reliability and availability of their applications by leveraging the resiliency of cloud-based IT resources.
Figure 1 - A resilient system in which Cloud B hosts a redundant implementation of Cloud Service A to provide failover in case Cloud Service A on Cloud A becomes unavailable.
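The hand-over described above can be sketched as a loop that tries each redundant implementation in turn. This is only an illustration under stated assumptions: the exception `ServiceUnavailable`, the function `call_with_failover`, and the two stand-in service functions are hypothetical names, and the outage on Cloud A is simulated.

```python
class ServiceUnavailable(Exception):
    """Raised by a (hypothetical) cloud service that cannot respond."""


def call_with_failover(request, implementations):
    """Try each redundant implementation in order; return the first success."""
    errors = []
    for name, handler in implementations:
        try:
            return name, handler(request)
        except ServiceUnavailable as exc:
            errors.append((name, str(exc)))  # remember why this one failed
    raise ServiceUnavailable(f"all implementations failed: {errors}")


def service_a_on_cloud_a(request):
    raise ServiceUnavailable("Cloud A is down")  # simulated outage


def service_a_on_cloud_b(request):
    return f"processed {request}"  # redundant implementation


served_by, result = call_with_failover(
    "order-42",
    [("cloud-a", service_a_on_cloud_a), ("cloud-b", service_a_on_cloud_b)],
)
```

In this run the request is served by the redundant implementation on Cloud B, which is the situation Figure 1 depicts.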
Cloud Provisioning [RGPV/Dec2013 (7)]
The ServiceNow Cloud Provisioning application facilitates the provisioning and management of virtual machines (VMs) within a company's infrastructure. Cloud provisioning delivers the key benefits of private (VMware) and public (Amazon EC2) virtual machine management in a single application that is fully integrated with ServiceNow. ServiceNow provides process and service automation with orchestration, approvals, and service catalog capabilities. ServiceNow can package and deliver infrastructure elements, such as servers, networks, and storage, to end-users through the service catalog. These virtual resources can then be requested through a self-service portal, provisioned automatically, and managed directly by the requester. The ServiceNow Cloud Provisioning application offers the following capabilities:
• Abstraction of virtualization systems: Virtual machine users are not required to know the details of the specific virtualization system. This allows use of a single interface to manage virtual resources in public and private clouds: VMware and Amazon EC2.
• Reuse of virtual machine configurations: ServiceNow uses VMware templates and Amazon EC2 images to create reusable catalog items in a wide range of sizes that users can select from the service catalog.
• Improved service catalog interface: Requesting the right virtual machine for the job is quick and easy in the improved services interface.
• Role-based access: Role-based security ensures that users have the proper privileges for viewing, creating, and managing virtual resources.
• Dedicated service portals: ServiceNow users view their virtual resources and request changes in a dedicated portal. Administrative and operational users manage virtual machines, provisioning tasks, and SLAs from portals that grant role-based access to virtual resources.
• Controlled lease duration: Default end dates for virtual machine leases are applied automatically to all requests. Lease duration controls prevent unused virtual machines from persisting past their intended use date.
• Automatic cost adjustment: Modifications to virtual resources that are subject to cost adjustments are recalculated automatically when the change is requested.
• Fully integrated with the ServiceNow platform: Approvals, notifications, security, asset management, and compliance capabilities are all integrated into virtual resource management processes.
How Cloud provisioning works
Cloud provisioning tasks are performed by users who are members of virtual provisioning groups. The entire process from configuration to provisioning, and eventually to service catalog requests for virtual resources, is controlled by members of these groups. This diagram shows how the process flow works:
All required tasks within cloud provisioning are performed by members of these groups:
Virtual Provisioning Cloud Administrator: Members of this group own the cloud provisioning environment and are responsible for configuring the different virtualization providers used by cloud provisioning. Cloud administrators can create service catalog items from VMware templates and Amazon EC2 images, approve requests for virtual machines, and monitor the cloud provisioning environment using the Service Monitoring Portal.
Virtual Provisioning Cloud Operator: Members of this group fulfill provisioning requests from users. Cloud operators perform the day-to-day work of cloud provisioning by completing tasks that appear in the Cloud Operations Portal. Cloud operators are assigned to specific virtualization providers and must be technically adept with the providers they support.
Virtual Provisioning Cloud User: Members of this group can request virtual machines from the service and use the My Virtual Assets portal to manage any virtual machines that are assigned to them.
S.NO | RGPV QUESTIONS | Year | Marks
1 | Discuss the benefits, characteristics and goals of provisioning. | Dec 2013 | 7
Unit-03/Lecture-03
Asset management
Hardware, software and digital assets can be managed from within the cloud. For example, suppliers offer the ability to view and manage your company's PCs, whether connected to your own internal network or not, and judge whether they are security compliant or need a software upgrade. Images, video and audio files can also be shared and managed online using Digital Asset Management (DAM) solutions. Tools for managing software lifecycles can also help to plan procurement or lease renewal.
Cloud computing has changed the economics of IT. Capital expenditure (CAPEX) is required to build IT infrastructure. Because organizations hire and use resources from Cloud service providers, they will see more Operational Expenditure (OPEX) instead. The Cloud provides various cost saving options:
• Infrastructure cost: If an organization needs to build a large-scale system, it may need to invest in buying hardware (servers, storage, routers), software licensing, etc., which involves high upfront cost (CAPEX). With Cloud, IT infrastructure investment is minimized.
• Management cost: Lack of in-house IT infrastructure minimizes the people cost associated with the management of those infrastructures.
• Power and Energy cost: Power consumption has become a concern for most organizations because energy costs continue to rise. The organizations that use Cloud applications and services save on power and energy use.
Consider a scenario where an organization wants to run its business-critical applications using 1000 servers to meet the desired service levels. It has two options to consider: the first option is to set up an on-site infrastructure for running 1000 servers, and the second is to hire 1000 instances of servers on an Amazon EC2 Cloud.
Let us consider the various cost components involved in both these options. In the first option, to set up an on-site infrastructure, the organization would require capital investment for purchasing server, storage, and network hardware, together with additional expenses for hardware maintenance, licensing OSs, power and cooling options, building data center infrastructure, administrative costs, and data transfer. In contrast, the second option involves only two cost components: the major cost on instance usage and a minor cost on data transfer. The diagram displayed on this slide, sourced from Amazon.com, shows that the first option incurs a higher TCO than the second option. This clearly illustrates the economic benefit of Cloud, compared to an on-site infrastructure.
Concept of MapReduce
Cloud computing is designed to provide on-demand resources or services over the Internet, usually at the scale and with the reliability level of a data center. MapReduce is a software framework that allows developers to write programs that process massive amounts of unstructured data in parallel across a distributed cluster of processors or stand-alone computers. It was developed at Google for indexing Web pages. The model is inspired by the map and reduce functions commonly used in functional programming, although their purpose in the MapReduce framework is not the same as in their original forms.
Structure of MapReduce Framework
The framework is divided into two parts:
• Map: It distributes work to different nodes in the distributed cluster.
• Reduce: It collects the work and resolves the results into a single value.
The MapReduce Framework is fault-tolerant because each node in the cluster is expected to report back periodically with completed work and status updates. If a node remains silent for longer than the expected interval, a master node makes note and re-assigns the work to other nodes.
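A word count is the usual way to illustrate the two parts. The sketch below runs in a single Python process, so it shows only the data flow, not the distribution across nodes or the fault tolerance described above. The function names are hypothetical, and the intermediate `shuffle` step that groups values by key is an assumption added for the illustration; the text does not describe it.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group intermediate values by key before they reach the reducers."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: resolve all values for one key into a single value."""
    return key, sum(values)


def word_count(documents):
    intermediate = []
    for doc in documents:  # in a real cluster each split could go to a different node
        intermediate.extend(map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(intermediate).items())


counts = word_count(["the cloud", "the grid and the cloud"])
```

Because each call to `map_phase` depends only on its own input split, the map work can be handed to any node, which is also what makes re-assigning the work of a silent node straightforward.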
Unit-03/Lecture-04
Cloud Computing Governance
Cloud computing is an emerging technology that may help enterprises meet the increased requirements of lower total cost of ownership (TCO), higher return on investment (ROI), increased efficiency, dynamic provisioning and utility-like pay-as-you-go services. However, many IT professionals cite the increased risks associated with trusting information assets to the cloud as something that must be clearly understood and managed by relevant stakeholders.
Cloud Computing marks the decrease in emphasis on 'systems' and the increase in emphasis on 'data'. With this trend, Cloud Computing stakeholders need to be aware of the best practices for governing and operating data and information in the Cloud. In this Working Group, we will focus on two key phases of Data Governance research:
• Survey Phase: Understanding the top requirements and needs of different stakeholders on governing and operating data in the Cloud.
• Best Practices Recommendation Phase: Prioritizing and answering the key problems and questions identified by Cloud stakeholders in Phase 1.
With the exponential increase in data deposited in Cloud environments (both public and private), research in the area of data, information and knowledge stored and processed in the Cloud is timely. Data is stored in many different forms and processed by a myriad of methods. There is a need for an authoritative voice in making sense of the key concerns with data storage and processing techniques. There is also an urgent requirement to align current practices with governance, risk and compliance regulations.
As the leading non-profit organization educating and promoting vendor-neutral best practices for Cloud Computing Security, the CSA is in the ideal position for this data-oriented research. Our team is led by the CSA Singapore and CSA Silicon Valley chapters, in collaboration with CSA Global. The editorial team and committee consist of a mix of practitioners and researchers in the area of Cloud data protection and governance. The Working Group aims to publish a white paper featuring different stakeholder groups' concerns over a quarterly publication cycle. Together, these white papers will contribute to future versions of the main CSA Guidance documentation.
The Enterprise Cloud Governance Solution
As you extend your public and private cloud deployments, cloud governance is becoming a larger issue. Cloud governance is essential for enterprises to maintain control over increasingly complex and integrated systems, services and human resources environments. Dell Cloud Manager helps you meet your cloud governance needs with a unified management solution that leverages your internal management systems. These are some of the governance issues that Dell Cloud Manager resolves:
• Access Controls: Limit access for internal or external teams to specific resources within one or more clouds. Our flexible role-based security allows you to allocate specific levels of access to development, QA and other teams. Integrate into your LDAP/AD deployment to extend your internal policies into your clouds.
• Financial Controls: Track and limit spending by project code, customer or department. Each time a new resource is provisioned across your clouds, Dell Cloud Manager will track the cost and limit the spending, per your specific budget requirements.
• Key Management and Encryption: Our patent-pending security architecture enforces a separation of roles. Dell Cloud Manager, running outside the cloud, is the guardian of your security keys and credentials, but has no access to your data. Your cloud provider has your encrypted data, but not the encryption keys.
• Logging and Auditing: Dell Cloud Manager logs all activity across your clouds. Track activity by user through reports or by integrating monitoring and management systems.
High Availability & Disaster Recovery for the Cloud
The infamous 5 nines (99.999%) up-time goal for systems is the "IT holy grail". High Availability (HA) is about enabling your systems to protect against and automatically recover from failures. In a geographically dispersed cloud, HA & DR take on a whole new meaning. We help you design meaningful HA & DR solutions using simple yet powerful constructs, with off-the-shelf technology solutions in the marketplace.
There is considerable confusion in the IT industry about the distinction between HA and Disaster Recovery (DR). We define HA as business and application continuance. DR is defined as the "must have" coverage when the HA configuration is completely unavailable (natural disaster or any unusual event). For example, if an application/database is part of an HA environment, the minimum number of servers required is 2. When DR is added to an HA environment, the minimum number of servers required is 3. In the end, HA & DR are all about your requirements and your environment. For your mission-critical systems, you will ALWAYS need both HA and DR, as one does not obviate the need for the other.
The High Availability and Disaster Recovery Cloud Services (HA & D/R Cloud) provide a way for clients to acquire high availability or disaster recovery computer capacity without having to own a system for this purpose. The HA & D/R Cloud can complement the Cloud services or be used as an offsite solution for clients who own their own local systems.
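To make the five-nines goal concrete, the arithmetic can be sketched as follows. This is a rough illustration only: the function names are hypothetical, the year is taken as 365 days, and the redundancy formula assumes that servers fail independently of one another, which is an assumption of this sketch rather than a claim from the text.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored


def allowed_downtime_minutes(availability_percent):
    """Yearly downtime budget implied by an availability percentage."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


def parallel_availability(*percentages):
    """Combined availability of redundant servers, assuming independent failures."""
    unavailable = 1.0
    for pct in percentages:
        unavailable *= 1 - pct / 100  # all servers must be down at once
    return 100 * (1 - unavailable)


five_nines_budget = allowed_downtime_minutes(99.999)  # about 5.26 minutes per year
two_server_ha = parallel_availability(99.0, 99.0)     # about 99.99 percent
```

Under these assumptions, 99.999% availability leaves a little over five minutes of downtime per year, and pairing two 99% servers in an HA configuration raises the combined figure to roughly 99.99%.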
HA Cloud
The HA Cloud is basically a mirroring service in which all the data is mirrored to a system located in the data centre. The mirrored system contains exactly the same data as the production system.
D/R Cloud
The D/R Cloud is a service in which Iptor provides an empty partition running only the system software. In case of disaster, the capacity of the partition can be increased and the client's system can be installed in the partition. The HA and D/R Cloud are hosted at Iptor's secure data center in Linköping, Sweden.
High Availability
Do you want low or no downtime during planned maintenance of your production system? Do you want to reduce the number of disasters that can lead to downtime?
Disaster Recovery
Do you want a failsafe for unexpected incidents? Do you want an easy and flexible solution for securing your most important data in case of disaster?
Unit-03/Lecture-05
Virtualization abstracts physical resources, such as compute, storage, and network, to function as logical resources. It creates an abstraction layer to hide the physical characteristics of resources from users. For example, in computer system virtualization, a physical machine appears as multiple logical machines (virtual machines), each running an operating system concurrently. A virtualized data center (VDC) is a data center in which compute, storage, network, and/or applications are virtualized. Compute virtualization enables running multiple operating systems concurrently on a compute system. This improves compute system utilization. Storage virtualization provides a logical view of storage and presents it to the computer system. In network virtualization, multiple logical networks are created on a physical network. Each of these virtualization technologies is explained in detail in the forthcoming modules.
By consolidating IT resources using virtualization techniques, organizations can optimize their infrastructure utilization. By improving the utilization of IT assets, organizations can reduce the costs associated with purchasing new hardware. They also reduce the space and energy costs associated with maintaining the resources. Moreover, fewer people are required to administer these resources, which further lowers the cost. Virtual resources are created using software, which enables faster deployment compared to deploying physical resources. Virtualization increases flexibility by allowing logical resources to be created and reclaimed based on business requirements.
Virtualization Technology
Virtualization is the process of converting a physical IT resource into a virtual IT resource.
Most types of IT resources can be virtualized, including:
• Servers - A physical server can be abstracted into a virtual server.
• Storage - A physical storage device can be abstracted into a virtual storage device or a virtual disk.
• Network - Physical routers and switches can be abstracted into logical network fabrics, such as VLANs.
• Power - A physical UPS and power distribution units can be abstracted into what are commonly referred to as virtual UPSs.
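One way to picture any of these abstractions is as a lookup table: a virtual resource is a logical name that currently points at some physical resource, and the pointer can be changed without the consumer noticing. The sketch below is a toy model under that assumption; the class `VirtualResourceMap` and the resource names are hypothetical and do not correspond to any real virtualization product.

```python
class VirtualResourceMap:
    """Hypothetical table mapping logical names to physical resources."""

    def __init__(self):
        self._mapping = {}

    def assign(self, logical_name, physical_resource):
        # Creating or changing a mapping is a single table update.
        self._mapping[logical_name] = physical_resource

    def resolve(self, logical_name):
        # A request for the logical name yields the physical resource behind it.
        return self._mapping[logical_name]


vmap = VirtualResourceMap()
vmap.assign("vdisk-1", "array-A/lun-7")   # virtual disk backed by one device
before = vmap.resolve("vdisk-1")
vmap.assign("vdisk-1", "array-B/lun-2")   # remapped to a different device
after = vmap.resolve("vdisk-1")
```

The consumer keeps using the name `vdisk-1` throughout; only the table entry changes when the backing device does.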
The first step in creating a new virtual server through virtualization software is the allocation of physical IT resources, followed by the installation of an operating system. Virtual servers use their own guest operating systems, which are independent of the operating system in which they were created. Both the guest operating system and the application software running on the virtual server are unaware of the virtualization process, meaning these virtualized IT resources are installed and executed as if they were running on a separate physical server. This uniformity of execution, which allows programs to run on physical systems as they would on virtual systems, is a vital characteristic of virtualization. Guest operating systems typically require seamless usage of software products and applications that do not need to be customized, configured, or patched in order to run in a virtualized environment.
Virtualization software runs on a physical server called a host or physical host, whose underlying hardware is made accessible by the virtualization software. The virtualization software functionality encompasses system services that are specifically related to virtual machine management and not normally found on standard operating systems. This is why this software is sometimes referred to as a virtual machine manager or a virtual machine monitor (VMM), but it is most commonly known as a hypervisor.
Hardware-based and Operating System-based Virtualization
Operating System-based Virtualization
Operating system-based virtualization is the installation of virtualization software in a preexisting operating system, which is called the host operating system (Figure 1). For example, a user whose workstation has a specific version of Windows installed decides they want to generate virtual machines. They install the virtualization software into the host operating system like any other program and use this application to generate and operate one or more virtual machines. The user needs to use the virtualization software to enable direct access to any of the generated virtual machines. Since the host operating system can provide hardware devices with the necessary support, operating system virtualization can rectify hardware compatibility issues even if the hardware driver is unavailable to the virtualization software.
Hardware independence that is enabled by virtualization allows hardware IT resources to be more flexibly used. For example, take a scenario in which the host operating system has the software necessary for controlling five network adapters that are available to the physical computer. The virtualization software can make the five network adapters available to the virtual machine, even if the virtualized operating system is usually incapable of physically housing five network adapters.
Figure 1 - The different logical layers of operating system-based virtualization, in which the virtualization software (VMM) is first installed into a full host operating system and subsequently used to generate virtual machines.
Virtualization software translates hardware IT resources that require unique software for operation into virtualized IT resources that are compatible with a range of operating systems. Since the host operating system is a complete operating system in itself, many operating system-based services that are available as organizational management and administration tools can be used to manage the virtualization host. Examples of such services include:
• Backup and Recovery
• Integration to Directory Services
• Security Management
Operating system-based virtualization can introduce demands and issues related to performance overhead, such as:
• The host operating system consumes CPU, memory, and other hardware IT resources.
• Hardware-related calls from guest operating systems need to traverse several layers to and from the hardware, which decreases overall performance.
• Licenses are usually required for host operating systems, in addition to individual licenses for each of their guest operating systems.
A concern with operating system-based virtualization is the processing overhead required to run the virtualization software and host operating systems. Implementing a virtualization layer will negatively affect overall system performance. Estimating, monitoring, and managing the resulting impact can be challenging because it requires expertise in system workloads, software and hardware environments, and sophisticated monitoring tools.
Hardware-based Virtualization
This option represents the installation of virtualization software directly on the virtualization host hardware so as to bypass the host operating system, which would presumably be engaged with operating system-based virtualization (Figure 2). Allowing the virtual machines to interact with hardware without requiring intermediary action from the host operating system generally makes hardware-based virtualization more efficient.
Figure 2 - The different logical layers of hardware-based virtualization, which does not require another host operating system.
Virtualization software is typically referred to as a hypervisor for this type of processing. A hypervisor has a simple user interface that requires a negligible amount of storage space. It exists as a thin layer of software that handles hardware management functions to establish a virtualization management layer. Device drivers and system services are optimized for the provisioning of virtual machines, although many standard operating system functions are not implemented. This type of virtualization system is essentially used to reduce the performance overhead inherent in the coordination that enables multiple VMs to interact with the same hardware platform.
One of the main issues of hardware-based virtualization concerns compatibility with hardware devices. The virtualization layer is designed to communicate directly with the host hardware, meaning all of the associated device drivers and support software must be compatible with the hypervisor. Hardware device drivers may not be as available to hypervisor platforms as they are to more commonly used operating systems. Also, host management and administration features may not include the range of advanced functions that are common to operating systems.
UNIT-03/LECTURE 6
Storage networking
The core elements of a classic data center (CDC) include compute (server), storage, network, application, and DBMS.
• Application: An application is a computer program that provides the logic for computing operations. Applications may use a DBMS, which uses operating system services to perform store/retrieve operations on storage devices.
• DBMS: A DBMS provides a structured way to store data in logically organized tables that are interrelated. A DBMS optimizes the storage and retrieval of data.
• Compute: Compute is a physical computing machine that runs operating systems, applications, and databases.
• Storage: Storage refers to a device that stores data persistently for subsequent use.
• Network: A network is a data path that facilitates communication between clients and compute systems or between compute systems and storage.
These core elements are typically viewed and managed as separate entities, but all of them must work together to address data processing requirements. Other elements of a CDC are power supplies and environmental controls, such as air conditioning and fire suppression.
Data is accessed and stored by an application using the underlying infrastructure. The key components of this infrastructure are the operating system (or file system), connectivity (network), and the storage itself. The storage device can be internal and/or external to the compute system. In either case, the host controller card inside the compute system accesses the storage devices using pre-defined protocols such as IDE/ATA, SCSI, or Fibre Channel. IDE/ATA and SCSI are popularly used in small and personal computing environments for accessing internal storage. Fibre Channel and iSCSI protocols are used for accessing data from an external storage device (or subsystem). External storage devices can be connected to the compute systems directly or through a storage network. Data can be accessed over a network in one of two ways: at the block level or at the file level.
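The difference between block-level and file-level access can be sketched with an ordinary file standing in for a disk. This is a simplified local illustration, not a storage-network client: the 512-byte block size, the function names, and the temporary `disk.img` file are all assumptions made for the example.

```python
import os
import tempfile

BLOCK_SIZE = 512  # bytes per block; an assumed value for illustration


def read_block(device_path, block_number):
    """Block-level access: address data by block number, not by name."""
    with open(device_path, "rb") as device:
        device.seek(block_number * BLOCK_SIZE)
        return device.read(BLOCK_SIZE)


def read_file(path):
    """File-level access: ask for a named file and let the file system find its blocks."""
    with open(path, "rb") as handle:
        return handle.read()


# Simulate a tiny "disk" of three blocks using a temporary file.
with tempfile.TemporaryDirectory() as workdir:
    disk = os.path.join(workdir, "disk.img")
    with open(disk, "wb") as image:
        for fill in (b"A", b"B", b"C"):
            image.write(fill * BLOCK_SIZE)
    second_block = read_block(disk, 1)   # the block filled with b"B"
    whole_image = read_file(disk)        # all three blocks, requested by name
```

With block-level access the caller must know where the data lives on the device; with file-level access that bookkeeping is left to the file system that serves the request.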
Direct Attached Storage (DAS) is an architecture where storage is connected directly to compute systems. Applications access data from DAS at a block level. DAS is classified as internal or external based on the location of the storage device with respect to the compute system.
• Internal DAS: In internal DAS architectures, the storage device is located within the compute system and is connected to the compute system by a serial or parallel bus. The internal disk drive of a compute system is an example of internal DAS.
• External DAS: In external DAS architectures, the storage device is located outside the compute system. In most cases, communication between the compute system and the storage device takes place over the SCSI or FC protocol. Tape libraries and directly connected external disk drive packs are some examples of external DAS. External DAS overcomes the distance and device count limitations of internal DAS.
FC SAN is a high-speed, dedicated network of compute systems and shared storage devices. FC SAN uses SCSI over FC protocol to transfer data between compute systems and storage devices. It provides block-level access to storage devices. The Fibre Channel Fabric is a logical space in which all nodes communicate with one another through an FC switch or multiple interconnected FC switches. If an FC Fabric involves multiple switches, they are linked together through an FC cable. In a switched fabric, the link between any two switches is called an Inter Switch Link (ISL).
Desktop virtualization
Desktop virtualization is a software technology that separates the desktop environment and associated application software from the physical client device that is used to access it. Desktop virtualization can be used in conjunction with application virtualization and (Windows) user profile management systems, now termed "user virtualization," to provide a comprehensive desktop environment management system. In this mode, all the components of the desktop are virtualized, which allows for a highly flexible and much more secure desktop delivery model.
In addition, this approach supports a more complete desktop disaster recovery strategy, because all components are essentially saved in the data center and backed up through traditional redundant maintenance systems. If a user's device or hardware is lost, the restore is much more straightforward, because all the components will be present at login from another device. Also, because no data is saved to the user's device, if that device is lost there is much less chance that any critical data can be retrieved and compromised. Below are more detailed descriptions of the types of desktop virtualization technologies that will be used in a typical deployment.
Application virtualization
Application virtualization is a software technology that encapsulates application software from the underlying operating system on which it is executed. A fully virtualized application is not installed in the traditional sense,[1] although it is still executed as if it were. At runtime the application behaves as if it were directly interfacing with the original operating system and all the resources managed by it, but it can be isolated or sandboxed to varying degrees. In this context, the term "virtualization" refers to the artifact being encapsulated (the application), which is quite different from its meaning in hardware virtualization, where it refers to the artifact being abstracted (the physical hardware).
UNIT-03/LECTURE 7 Virtualization benefits: [RGPV/Dec2013 (7)]
• Allows applications to run in environments that do not suit the native application: e.g. Wine allows some Microsoft Windows applications to run on Linux; CDE, a lightweight application virtualization, allows Linux applications to run in a distribution-agnostic way.
• May protect the operating system and other applications from poorly written or buggy code, and in some cases provides memory protection and IDE-style debugging features, for example as in the IBM OLIVER.
• Uses fewer resources than a separate virtual machine.
• Runs applications that are not written correctly, for example applications that try to store user data in a read-only, system-owned location.
• Runs incompatible applications side by side, at the same time and with minimal regression testing against one another.
• Reduces system integration and administration costs by maintaining a common software baseline across multiple diverse computers in an organization.
• Implements the security principle of least privilege by removing the requirement for end users to have Administrator privileges in order to run poorly written applications.
• Simplifies operating system migrations.
• Improves security by isolating applications from the operating system.
• Allows applications to be copied to portable media and then imported to client computers without the need to install them (so-called portable software).
Server Virtualization
Server virtualization is the masking of server resources, including the number and identity of individual physical servers, processors, and operating systems, from server users. The server administrator uses a software application to divide one physical server into multiple isolated virtual environments. The virtual environments are sometimes called virtual private servers, but they are also known as guests, instances, containers or emulations. There are three popular approaches to server virtualization: the virtual machine model, the paravirtual machine model, and virtualization at the operating system (OS) layer.
Virtual machines are based on the host/guest paradigm. Each guest runs on a virtual imitation of the hardware layer. This approach allows the guest operating system to run without modifications. It also allows the administrator to create guests that use different operating systems. The guest has no knowledge of the host's operating system because it is not aware that it is not running on real hardware. It does, however, require real computing resources from the host, so it uses a hypervisor to coordinate instructions to the CPU. The hypervisor is also called a virtual machine monitor (VMM). It validates all the guest-issued CPU instructions and manages any executed code that requires additional privileges. VMware and Microsoft Virtual Server both use the virtual machine model.
The paravirtual machine (PVM) model is also based on the host/guest paradigm, and it uses a virtual machine monitor too. In the paravirtual machine model, however, the guest operating system's code is modified to work with the VMM. This modification is called porting. Porting supports the VMM so that it can use privileged system calls sparingly. Like virtual machines, paravirtual machines are capable of running multiple operating systems. Xen and UML both use the paravirtual machine model.
Virtualization at the OS level works a little differently.
It isn't based on the host/guest paradigm.
In the OS level model, the host runs a single OS kernel as its core and exports operating system functionality to each of the guests. Guests must use the same operating system as the host, although different distributions of the same system are allowed. This distributed architecture eliminates system calls between layers, which reduces CPU usage overhead. It also requires that each partition remain strictly isolated from its neighbors so that a failure or security breach in one partition cannot affect any of the other partitions. In this model, common binaries and libraries on the same physical machine can be shared, allowing an OS level virtual server to host thousands of guests at the same time. Virtuozzo and Solaris Zones both use OS-level virtualization.
Server virtualization can be viewed as part of an overall virtualization trend in enterprise IT that includes storage virtualization, network virtualization, and workload management. This trend is one component in the development of autonomic computing, in which the server environment will be able to manage itself based on perceived activity. Server virtualization can be used to eliminate server sprawl, to make more efficient use of server resources, to improve server availability, to assist in disaster recovery, testing and development, and to centralize server administration.
Common Uses of Server Virtualization
One common usage of this technology is in Web servers. Using virtual Web servers is a popular way to provide low-cost Web hosting services. Instead of requiring a separate computer for each Web server, dozens of virtual servers can co-reside on the same computer.
Benefits of Server Virtualization
Server virtualization has many benefits. For example, it lets each virtual server run its own operating system, and each virtual server can be rebooted independently of the others. Server virtualization also reduces costs because less hardware is required.
S.NO | RGPV QUESTIONS | Year | Marks
1 | Explain what you understand by Hypervisor management software and Infrastructure Requirements. | Dec 2013 | 7
UNIT-03/LECTURE 8 Storage virtualization
Storage virtualization is a concept and term used within computer science. Specifically, storage systems may use virtualization concepts as a tool to enable better functionality and more advanced features within and across storage systems. Broadly speaking, a "storage system" is also known as a storage array, a disk array or a filer. Storage systems typically use special hardware and software along with disk drives in order to provide very fast and reliable storage for computing and data processing. Storage systems are complex, and may be thought of as special-purpose computers designed to provide storage capacity along with advanced data protection features. Disk drives are only one element within a storage system, along with hardware and special-purpose embedded software within the system.
Storage systems can provide either block accessed storage or file accessed storage. Block access is typically delivered over Fibre Channel, iSCSI, SAS, FICON or other protocols. File access is often provided using the NFS or CIFS protocols. Within the context of a storage system, there are two primary types of virtualization that can occur:
• Block virtualization, as used in this context, refers to the abstraction (separation) of logical storage (partition) from physical storage so that it may be accessed without regard to physical storage or heterogeneous structure. This separation allows the administrators of the storage system greater flexibility in how they manage storage for end users.[1]
• File virtualization addresses the NAS challenges by eliminating the dependencies between the data accessed at the file level and the location where the files are physically stored. This provides opportunities to optimize storage use and server consolidation and to perform non-disruptive file migrations.
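To make the separation of logical from physical storage concrete, here is a minimal Python sketch. It is an illustration only, not how any particular storage array works: the `Extent` and `LogicalVolume` names, the disk labels and the block numbers are all assumptions made up for this example. It shows a logical volume that presents one contiguous block range while the blocks actually live in extents on two different physical disks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Extent:
    """A contiguous run of blocks on one physical disk (hypothetical model)."""
    disk: str    # physical disk label, e.g. "disk-A"
    start: int   # first physical block of the extent
    length: int  # number of blocks in the extent


class LogicalVolume:
    """Presents several physical extents as one contiguous logical block range."""

    def __init__(self, extents):
        self.extents = list(extents)

    def size(self):
        """Total number of logical blocks in the volume."""
        return sum(extent.length for extent in self.extents)

    def resolve(self, logical_block):
        """Translate a logical block number into (disk, physical block)."""
        if logical_block < 0:
            raise IndexError("logical block must be non-negative")
        offset = logical_block
        for extent in self.extents:
            if offset < extent.length:
                return extent.disk, extent.start + offset
            offset -= extent.length
        raise IndexError("logical block outside the volume")


# The host sees blocks 0..149; the mapping decides where they really live.
volume = LogicalVolume([Extent("disk-A", 1000, 100), Extent("disk-B", 0, 50)])
print(volume.resolve(0))    # first logical block, stored on disk-A
print(volume.resolve(100))  # first block of the second extent, on disk-B
```

Because callers only ever use logical block numbers, an administrator could in principle swap the extent list (for example after migrating data to a new disk) without the host noticing, which is the flexibility the paragraph above describes.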
File Level Storage vs. Block Level Storage The two most popular storage system technologies are file level storage and block level storage. File level storage is seen and deployed in Network Attached Storage (NAS) systems. Block level storage is seen and deployed in Storage Area Network (SAN) storage. In the article below, we will explain the major differences between file level storage vs. block level storage.
File Level Storage - This is the most commonly used storage technology; it is found in hard drives, NAS systems and so on. In file level storage, the storage is configured with a protocol such as NFS or SMB/CIFS, and the files are stored on it and accessed from it in bulk.
File level storage is simple to use and implement. It stores files and folders, and they look the same to the clients accessing them as to the system that stores them. File level storage is inexpensive to maintain compared with its counterpart, block level storage. Network attached storage systems usually depend on file level storage.
File level storage can handle access control, integration with corporate directories, and so on.
Block Level Storage - In block level storage, raw volumes of storage are created and each volume can be controlled as an individual hard drive. These volumes are controlled by server-based operating systems, and each volume can be individually formatted with the required file system.
Block level storage is usually deployed in a SAN (storage area network) environment. This level of storage allows the systems connected to it to boot from it. Block level storage can be used to store files and can work as storage for special applications such as databases, virtual machine file systems and so on. Data transport in block level storage is efficient and reliable. Each volume can be formatted with the file system the application requires, such as NTFS (Windows) or VMFS (VMware). Each storage volume can be treated as an independent disk drive and can be controlled by an external server operating system. Block level storage uses protocols such as iSCSI, Fibre Channel and FCoE for data transfer; SCSI commands act as the communication interface between the initiator and the target.
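The difference between the two access levels can be sketched in a few lines of Python. This is a toy model under stated assumptions, not a real NAS or SAN implementation: `BlockDevice`, `FileServer`, the 512-byte block size and the file path are all hypothetical. The block device only understands numbered fixed-size blocks, while the file server lets callers name a file by path and keeps the block layout to itself.

```python
BLOCK_SIZE = 512  # bytes per block; an assumed value for this sketch


class BlockDevice:
    """Toy block device: callers address fixed-size blocks by number."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self._data = bytearray(num_blocks * BLOCK_SIZE)

    def _check(self, lba):
        if not 0 <= lba < self.num_blocks:
            raise IndexError("block address outside the device")

    def write_block(self, lba, payload):
        self._check(lba)
        if len(payload) != BLOCK_SIZE:
            raise ValueError("payload must be exactly one block")
        self._data[lba * BLOCK_SIZE:(lba + 1) * BLOCK_SIZE] = payload

    def read_block(self, lba):
        self._check(lba)
        return bytes(self._data[lba * BLOCK_SIZE:(lba + 1) * BLOCK_SIZE])


class FileServer:
    """Toy file server: callers use paths; the block layout stays hidden."""

    def __init__(self, device):
        self._device = device
        self._table = {}     # path -> (list of block numbers, file length)
        self._next_free = 0  # naive allocator: blocks are never reclaimed

    def write_file(self, path, content):
        blocks = []
        for i in range(0, len(content), BLOCK_SIZE):
            # Pad the final chunk with zero bytes so it fills a whole block.
            chunk = content[i:i + BLOCK_SIZE].ljust(BLOCK_SIZE, b"\0")
            self._device.write_block(self._next_free, chunk)
            blocks.append(self._next_free)
            self._next_free += 1
        self._table[path] = (blocks, len(content))

    def read_file(self, path):
        blocks, length = self._table[path]
        raw = b"".join(self._device.read_block(b) for b in blocks)
        return raw[:length]  # strip the padding added on write


device = BlockDevice(num_blocks=8)
server = FileServer(device)
server.write_file("/reports/q1.txt", b"x" * 600)  # file-level: name a path
first_block = device.read_block(0)                 # block-level: name an address
```

A client of `FileServer` never learns that the 600-byte file occupies blocks 0 and 1; a client of `BlockDevice` sees only blocks and would need its own file system to interpret them.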
UNIT-03/LECTURE 9 [RGPV/Dec2013 (7)]
A hypervisor or virtual machine monitor (VMM) is a piece of computer software, firmware or hardware that creates and runs virtual machines. A computer on which a hypervisor is running one or more virtual machines is defined as a host machine. Each virtual machine is called a guest machine. The hypervisor presents the guest operating systems with a virtual operating platform and manages the execution of the guest operating systems. Multiple instances of a variety of operating systems may share the virtualized hardware resources.
The first hypervisors providing full virtualization were the test tool SIMMON and IBM's one-off research CP-40 system, which began production use in January 1967 and became the first version of IBM's CP/CMS operating system. CP-40 ran on a S/360-40 that was modified at the IBM Cambridge Scientific Center to support Dynamic Address Translation, a key feature that allowed virtualization. Prior to this time, computer hardware had only been virtualized enough to allow multiple user applications to run concurrently (see CTSS and IBM M44/44X). With CP-40, the hardware's supervisor state was virtualized as well, allowing multiple operating systems to run concurrently in separate virtual machine contexts.
Programmers soon re-implemented CP-40 (as CP-67) for the IBM System/360-67, the first production computer system capable of full virtualization. IBM first shipped this machine in 1966; it included page-translation-table hardware for virtual memory, and other techniques that allowed a full virtualization of all kernel tasks, including I/O and interrupt handling. (Note that its "official" operating system, the ill-fated TSS/360, did not employ full virtualization.) Both CP-40 and CP-67 began production use in 1967. CP/CMS was available to IBM customers from 1968 to 1972, in source code form without support. CP/CMS formed part of IBM's attempt to build robust time-sharing systems for its mainframe computers.
By running multiple operating systems concurrently, the hypervisor increased system robustness and stability: even if one operating system crashed, the others would continue working without interruption. Indeed, this even allowed beta or experimental versions of operating systems, or even of new hardware, to be deployed and debugged without jeopardizing the stable main production system and without requiring costly additional development systems.
IBM announced its System/370 series in 1970 without any virtualization features, but added them in the August 1972 Advanced Function announcement. Virtualization has been featured in all successor systems (all modern-day IBM mainframes, such as the zSeries line, retain backwards compatibility with the 1960s-era IBM S/360 line). The 1972 announcement also included VM/370, a reimplementation of CP/CMS for the S/370. Unlike CP/CMS, IBM provided support for this version (though it was still distributed in source code form for several releases). VM stands for Virtual Machine, emphasizing that all, and not just some, of the hardware interfaces are virtualized.
Both VM and CP/CMS enjoyed early acceptance and rapid development by universities, corporate users, and time-sharing vendors, as well as within IBM. Users played an active role in ongoing development, anticipating trends seen in modern open source projects. However, in a series of disputed and bitter battles, time-sharing lost out to batch processing through IBM political infighting, and VM remained IBM's "other" mainframe operating system for decades, losing to MVS. It enjoyed a resurgence of popularity and support from 2000 as the z/VM product, for example as the platform for Linux for zSeries.
As mentioned above, the VM control program includes a hypervisor-call handler that intercepts DIAG ("Diagnose") instructions used within a virtual machine. This provides fast-path nonvirtualized execution of file-system access and other operations (DIAG is a model-dependent privileged instruction, not used in normal programming, and thus is not virtualized. It is therefore available for use as a signal to the "host" operating system). When first implemented in CP/CMS release 3.1, this use of DIAG provided an operating system interface that was analogous to the System/360 Supervisor Call instruction (SVC), but that did not require altering or extending the system's virtualization of SVC. In 1985 IBM introduced the PR/SM hypervisor to manage logical partitions (LPAR).
[RGPV/Dec2013 (7)]
Hypervisors are categorized into two types: hosted hypervisor and bare-metal hypervisor.
• Type 1 (Bare-metal hypervisor): In this type, the hypervisor is directly installed on the x86 based hardware. A bare-metal hypervisor has direct access to the hardware resources. Hence, it is more efficient than a hosted hypervisor.
• Type 2 (Hosted hypervisor): In this type, the hypervisor is installed and run as an application on top of an operating system. Since it is running on an operating system, it supports the broadest range of hardware configurations.
A hypervisor is the primary component of virtualization that enables compute system partitioning (i.e. partitioning of CPU and memory). In this course, we will focus on type 1 hypervisors because they are the type most widely used within a Virtualized Data Center (VDC).
Infrastructure Requirements
x86 based operating systems (OS) are designed to run directly on the bare-metal hardware, so they naturally assume that they fully "own" the compute hardware. As shown in the figure on this slide, the x86 CPU architecture offers four levels of privilege, known as Ring 0, 1, 2, and 3, to operating systems and applications to manage access to the compute hardware. While user-level applications typically run in Ring 3, the operating system needs to have direct access to the hardware and must execute its privileged instructions in Ring 0. Privileged instructions are a class of instructions that usually includes interrupt handling, timer control, and input/output instructions. These instructions can be executed only when the compute system is in a special privileged mode, generally available to an operating system but not to user programs. Virtualizing the x86 architecture requires placing a virtualization layer below the operating system (which expects to be in the most privileged Ring 0) to create and manage the virtual machines that deliver shared resources. Further complicating the situation, some privileged operating system instructions cannot effectively be virtualized because they have different semantics when they are not executed in Ring 0.
The difficulty in capturing and translating these privileged instruction requests at runtime was the challenge that originally made x86 architecture virtualization look impossible. The three techniques that now exist for handling privileged instructions to virtualize the CPU on the x86 architecture are as follows:
1. Full virtualization using Binary Translation (BT)
2. Operating system-assisted virtualization, or paravirtualization
3. Hardware-assisted virtualization
S.NO | RGPV QUESTIONS | Year | Marks
1 | Describe different types of Hypervisors, with example and block diagram. Also enlist the advantages of virtualization. | Dec 2013 | 7
2 | What is the difference between process virtual machine, host VMMs and native VMMs? | Dec 2013 | 7
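The privilege-ring problem discussed under Infrastructure Requirements can be illustrated with a small Python simulation. This is a sketch of the general trap-and-emulate idea only, under loud assumptions: the instruction names are invented for the example (they are not real x86 mnemonics), and the model covers only the easy case where a privileged instruction faults when run outside Ring 0. The instructions that silently behave differently outside Ring 0, which are what make x86 hard to virtualize, are not modelled here.

```python
# Invented instruction names standing in for the privileged classes named
# in the text (interrupt handling, timer control, input/output).
PRIVILEGED = {"DISABLE_INTERRUPTS", "SET_TIMER", "IO_WRITE"}


class PrivilegeFault(Exception):
    """Raised when privileged code runs outside Ring 0 (simulated trap)."""


def cpu_execute(instruction, ring):
    """Model of the hardware check: privileged instructions need Ring 0."""
    if instruction in PRIVILEGED and ring != 0:
        raise PrivilegeFault(instruction)
    return f"hw:{instruction}"


class Hypervisor:
    """Runs guest code deprivileged and emulates whatever traps."""

    def __init__(self):
        self.trapped = []  # record of instructions the hypervisor emulated

    def run_guest(self, instructions, guest_ring=1):
        results = []
        for instruction in instructions:
            try:
                # Unprivileged instructions run directly on the "hardware".
                results.append(cpu_execute(instruction, guest_ring))
            except PrivilegeFault:
                # The trap hands control to the hypervisor, which emulates
                # the effect on the guest's virtual hardware instead.
                self.trapped.append(instruction)
                results.append(f"emulated:{instruction}")
        return results


hypervisor = Hypervisor()
outcome = hypervisor.run_guest(["ADD", "IO_WRITE", "LOAD"])
print(outcome)
```

Binary translation, paravirtualization and hardware assistance can each be read as a different answer to the question of what to do when this simple trap does not fire.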
UNIT-03/LECTURE 10 Virtual Local Area Networks (VLAN)
Virtual LANs can be viewed as a group of devices on different LAN segments which can communicate with each other as if they were all on the same physical LAN segment. Switches using VLANs create the same division of the network into separate domains but do not have the latency problems of a router. Switches are also a more cost-effective solution.
By now you are probably wondering why someone would go to all this work to end up with what appears to be the same network as the original one. In many instances, LANs have been grouped with physical location being the primary concern. VLANs are built with traffic patterns in mind. Using VLANs, we are able to group these devices into a single domain. This allows us to confine broadcast traffic for this workgroup to just those devices that need to see it, and reduce traffic to the rest of the network. Connection speed increases because the latency of router connections is eliminated. Security can also be improved by not allowing access to the host from foreign networks, for example those that originate from another subnet beyond the router.
We can now create a network that is independent of physical location and group users into logical workgroups. For instance, if a department has users in three different locations, they can now be given access to servers and printers as if they were all in the same building.
Benefits of VLANs
Here are the main benefits of using VLANs in your network:
• Limit the size of each broadcast domain: This is probably the most important benefit of VLANs. VLANs increase the number of broadcast domains and reduce the size of each broadcast domain. VLANs subdivide a LAN into several logical LANs. Broadcast frames are sent within a LAN, so if you subdivide the LAN into several smaller VLANs, broadcast frames are only sent within each VLAN.
• Improve security: VLANs allow network administrators to logically group switch ports for specific purposes. They can assign users (computer hosts) to those VLANs by controlling the port where the computer host connects to the switch. Ports are assigned to the port group (VLAN) where that computer host needs to be. Those ports can be located on any switch in the network: a VLAN can span more than just one switch. This is a very efficient way to control access to network resources.
• Improve network management and flexibility: VLAN membership allows network administrators to control access to network resources, and they can manage this membership from a single location for all switches in the network. Better yet, by using dynamic VLAN membership, VLAN Trunking Protocol (VTP), and inter-VLAN routing, they can control access to network resources in a very large network with minimum effort.
• Improve network usage and efficiency: Network administrators assign network resources to a certain VLAN. Only computer hosts and network devices in that VLAN have access to those resources. A Marketing server should not be disturbed by requests from engineering computers. By subdividing the network into three VLANs, you ensure that engineering computers will not disturb Marketing servers and vice versa.
Virtual storage area network (VSAN)
A virtual storage area network (VSAN) is a logical partition in a storage area network (SAN). VSANs allow traffic to be isolated within specific portions of a storage area network. The use of multiple VSANs can make a system easier to configure and scale out. Subscribers can be added or relocated without the need for changing the physical layout. If a problem occurs in one VSAN, that problem can be handled with a minimum of disruption to the rest of the network. Because the independence of VSANs minimizes the total system's vulnerability, security is improved. VSANs also offer the possibility of data redundancy, minimizing the risk of catastrophic data loss. The term is most often associated with Cisco Systems and is often mentioned in conjunction with zoning. Zoning splits a SAN into multiple, isolated subnetworks. The concept behind a VSAN is often compared to that of a virtual local area network (VLAN). VLANs segregate broadcasts from other networks.
A virtual SAN or virtual fabric is a logical fabric created on a physical FC SAN. A virtual SAN enables communication among a group of nodes (physical servers and storage systems) with a common set of requirements, regardless of their physical location in the fabric. A VSAN conceptually functions in the same way as a VLAN. Each VSAN acts as an independent fabric and is managed independently. Each VSAN has its own fabric services (name server, zoning), configuration, and set of FC addresses. Fabric-related configurations in one VSAN do not affect the traffic in another VSAN.
The events causing traffic disruptions in one VSAN are contained within that VSAN and are not propagated to other VSANs. Similar to VLAN tagging, VSAN has its own tagging mechanism. The purpose of VSAN tagging is similar to that of VLAN tagging in a LAN. The diagram displayed on this slide shows the assignment of VSAN ID and the frame-forwarding process.
Benefits of VSAN
1. Virtual Machine Deployments: Extending VSANs to Virtual Machines and Assigning Different Levels of SAN Services on a per-Virtual Machine Basis
2. Blade Server Deployments: Using VSANs to Provide Different Levels of SAN Services per Blade Server
3. Consolidation of Fabrics with Different Security Scopes: Providing Security and Isolation on a Single Switch
4. Switch-to-Switch Interoperability: Connecting to Other Vendors' Switches without Merging Fabrics or Losing Capabilities
5. Resilient SAN Extension Solution for Disaster Recovery: Connecting Data Centers without Merging Fabrics (IVR)
6. MAN and WAN Network Consolidation: Consolidating Multiple Types of SANs (Backup, Replication, and FICON) Using VSANs
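The way a VLAN confines broadcast traffic, described earlier in this lecture, can be sketched with a toy switch model in Python. Everything here is an assumption for illustration: the `Switch` class, the port numbers and the VLAN IDs 10 and 20 are hypothetical, and real switches do far more (tagging on trunk links, MAC learning and so on). Since the text says a VSAN conceptually works the same way, the same picture can be read with VSAN IDs and fabric ports in place of VLAN IDs and switch ports.

```python
class Switch:
    """Toy switch: each access port belongs to exactly one VLAN."""

    def __init__(self):
        self.port_vlan = {}  # port number -> VLAN ID

    def assign(self, port, vlan_id):
        """Place a port in a VLAN (port-based membership)."""
        self.port_vlan[port] = vlan_id

    def broadcast(self, ingress_port):
        """Return the ports a broadcast frame is forwarded to.

        The frame stays inside the sender's VLAN and is not sent back
        out of the port it arrived on.
        """
        vlan_id = self.port_vlan[ingress_port]
        return sorted(
            port
            for port, vid in self.port_vlan.items()
            if vid == vlan_id and port != ingress_port
        )


switch = Switch()
for port in (1, 2, 5):
    switch.assign(port, 10)  # hypothetical engineering VLAN
for port in (3, 4):
    switch.assign(port, 20)  # hypothetical marketing VLAN

print(switch.broadcast(1))  # reaches only the other engineering ports
print(switch.broadcast(3))  # reaches only the other marketing port
```

Moving a host to another workgroup in this model is a single `assign` call, with no recabling, which mirrors the management benefit listed above.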
REFERENCE
BOOK | AUTHOR | PRIORITY
Mastering Cloud Computing | Buyya, Selvi | 1
Cloud Computing | Kumar Saurabh | 2