What is an Element Management System?
An Element Management System (EMS) consists of systems and applications that manage network elements (NEs) on the network element management layer (NEL) of the Telecommunications Management Network (TMN) model shown below.
As recommended by ITU-T, the Element Management System's key functionality is divided into five areas: Fault, Configuration, Accounting, Performance and Security (FCAPS). Portions of each FCAPS area fit into the TMN model. Northbound, the EMS interfaces with Network Management Systems and/or Service Management Systems, depending on the deployment scenario. Southbound, the EMS talks to the devices.
What is the typical feature set of an Element Management System?
The typical set of features depends on the equipment and the market it caters to. The following grid shows a subset of the functionality prescribed by ITU-T and Telcordia for the key functional areas.
Fault | Configuration | Accounting | Performance | Security
Alarm handling | Auto discovery | Service usage | Report generation | Prevention
Alarm correlation | Network provisioning | Service level agreements | Performance monitoring | Authentication
Alarm forwarding | Auto backup and recovery | Billing | Data collection and correlation | System access control
Filtering and filter management | Service activation | | Threshold-based reporting | Detection
Log management | Software upgrade to devices | | | Intrusion recovery
| Inventory management | | | Containment and recovery
What are the architectural challenges to keep in mind while building an EMS, and what is a suitable architecture for building one?
In a real-world situation, Equipment Vendors (EVs) have to cater to different market segments, ranging from small Service Providers to large ILECs, and the EMS requirements of each segment are different. A simple EMS developed for a small Service Provider may not meet the scalability needs of an ILEC, while a full-featured, highly available and scalable EMS that suits an ILEC might be overkill for a smaller Service Provider. Finally, developing a new EMS for each segment may not be a practical or profitable option.
When confronted with this problem, EVs attempt to solve it by developing a basic EMS initially and hoping that it can be scaled up to meet the demanding requirements of an ILEC. This strategy makes sense if the EMS is architected properly. From our experience, in many instances it is not, because the initial development of the EMS is done at the last moment and in a rush. The EVs then face an uphill task in scaling up the EMS. The key issue in most of these situations is that the EMS was not initially architected to meet the requirements of large Service Providers. It is therefore essential to choose the correct architecture so that the EMS can be implemented to meet the requirements of different segments effectively.
A variety of architectures can be used to build an EMS. Based on our experience, we recommend that the EMS be developed using JEE architecture. Since JEE-based applications can be extended to n tiers (if required) to meet the customer's scalability needs, we believe it is a very good fit for EMS development.
The figure below shows a sample five-tier application based on JEE architecture (for more in-depth information on JEE architecture, please visit http://java.sun.com/j2ee/appmodel.html). Tier 1 is the client tier. Tier 2 is the web tier (WT). Tier 3 is the presentation tier (PT). Tier 4 is the application tier (OT) and Tier 5 is the data tier (represented in the figure as the database).
Figure: JEE Architecture
It is important to note that the boundaries between tiers are logical, and it is entirely possible to run all tiers on one and the same physical machine. The most important thing is that the system is well structured and that the software boundaries between the tiers are well defined.
What are the various advantages offered by JEE Architecture for EMS development?
JEE architecture offers the following advantages for the development of an EMS:
• A clear separation of user interface control and data presentation from application logic lets more clients access a wide variety of server applications. This allows quicker development of the EMS application through the reuse of pre-built components, and a shorter test phase, because in most cases the server components have already been tested.
• Redefining the storage strategy does not affect the clients. In well-designed systems, the client accesses data through a stable, well-designed interface that encapsulates all the storage details. Even radical changes, such as switching from an RDBMS to an OODBS, do not affect the client.
• Business objects can expose application logic as "services" on the network.
• Because servers are, as a rule, "trusted" systems, data protection and security are simpler to achieve there. It therefore makes sense to run critical business processes that work with security-sensitive data on the server.
• The database and server tiers can easily be deployed in a cluster, enhancing scalability and availability (a key requirement if the EMS needs to be deployed in large networks). Clusters also allow dynamic load balancing: if performance bottlenecks occur, the server process can be moved to other servers at runtime.
• Change management: it is easier and faster to replace a component on the server than to furnish numerous PCs with new program versions.
Also, by the judicious use of open source technologies (where possible), the EV can significantly reduce the development, maintenance and deployment cost of the EMS while still enjoying the benefits above. Finally, since the EMS is based on a well-defined and standardized architecture, future development and enhancement of the EMS will be easy and cost-effective.
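
The point about storage strategy can be illustrated with a minimal sketch. It is written in Python for brevity rather than as JEE code, and every name in it (AlarmStore, InMemoryAlarmStore, SqliteAlarmStore, active_alarm_summary) is hypothetical. It assumes the client only ever talks to a small storage interface, so the backing store can be swapped without touching client code.

```python
import sqlite3
from abc import ABC, abstractmethod
from typing import Dict, List


class AlarmStore(ABC):
    """Hypothetical storage interface; clients depend only on this."""

    @abstractmethod
    def save(self, device_id: str, alarm: str) -> None:
        ...

    @abstractmethod
    def alarms_for(self, device_id: str) -> List[str]:
        ...


class InMemoryAlarmStore(AlarmStore):
    """Keeps alarms in a dictionary (stand-in for one storage strategy)."""

    def __init__(self) -> None:
        self._alarms: Dict[str, List[str]] = {}

    def save(self, device_id: str, alarm: str) -> None:
        self._alarms.setdefault(device_id, []).append(alarm)

    def alarms_for(self, device_id: str) -> List[str]:
        return list(self._alarms.get(device_id, []))


class SqliteAlarmStore(AlarmStore):
    """Keeps alarms in a relational table (a second storage strategy)."""

    def __init__(self, path: str = ":memory:") -> None:
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS alarms (device_id TEXT, alarm TEXT)"
        )

    def save(self, device_id: str, alarm: str) -> None:
        self._db.execute("INSERT INTO alarms VALUES (?, ?)", (device_id, alarm))
        self._db.commit()

    def alarms_for(self, device_id: str) -> List[str]:
        rows = self._db.execute(
            "SELECT alarm FROM alarms WHERE device_id = ? ORDER BY rowid",
            (device_id,),
        )
        return [row[0] for row in rows]


def active_alarm_summary(store: AlarmStore, device_id: str) -> str:
    """Client-side code: unaware of which store implementation is in use."""
    return f"{device_id}: {len(store.alarms_for(device_id))} alarm(s)"
```

Passing either store to active_alarm_summary yields the same result, which is the property the storage bullet describes.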
What are the challenges in developing an EMS for Wireless Networks?
Wireless networks pose a different set of challenges to an EMS than typical wired networks do. The EMS for a wireless network should be able to handle the challenges discussed below.
To begin with, managing a wireless network involves more than managing the individual Base Stations, Access Points and CPEs in the network. It is equally important to discover and manage the links between the devices and to provide a clear picture of the network topology. Tracking the availability of the Access Points is also essential for having a backup plan, which is a must for managing wireless networks.
In wireless networks there can be different types of links, such as active and standby/backup links. It is important for the EMS to distinguish the link types so that they can be portrayed accurately, which is of value to the network administrator.
Another key challenge for the EMS is to track and manage the clients connected to the wireless Access Points in a network. Client connectivity maps are useful to a service provider for network planning. Clients attached to an Access Point can be extremely dynamic: a client could connect to the network from one Access Point and later connect from another. An EMS therefore needs sufficient intelligence to track and manage the clients efficiently so that it can derive a meaningful usage pattern.
Since throughput depends on signal strength and on interference from other Access Points, clients get different throughput at different locations. The EMS should be able to identify such problem areas so that the network administrator can try to resolve them.
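
The client-tracking requirement above can be sketched roughly as follows. This is an illustrative outline only: the ClientTracker class, its method names, and the idea of identifying clients by MAC address are assumptions made for the example, not part of any particular EMS.

```python
from collections import Counter
from typing import Dict, List, Tuple


class ClientTracker:
    """Hypothetical tracker of which Access Point each client is attached to."""

    def __init__(self) -> None:
        # client id -> Access Point it was last seen on
        self.current_ap: Dict[str, str] = {}
        # client id -> list of (timestamp, Access Point) attachment changes
        self.history: Dict[str, List[Tuple[float, str]]] = {}

    def observe(self, client_mac: str, ap_id: str, timestamp: float) -> None:
        """Record a sighting; only a change of Access Point extends the history."""
        if self.current_ap.get(client_mac) != ap_id:
            self.history.setdefault(client_mac, []).append((timestamp, ap_id))
            self.current_ap[client_mac] = ap_id

    def roam_count(self, client_mac: str) -> int:
        """Number of times the client moved from one Access Point to another."""
        return max(len(self.history.get(client_mac, [])) - 1, 0)

    def clients_per_ap(self) -> Counter:
        """Current client count per Access Point, e.g. for a connectivity map."""
        return Counter(self.current_ap.values())
```

Feeding periodic polling results into observe gives per-client movement history and per-Access-Point load, the raw material for the usage patterns mentioned above.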
The ability of the EMS to depict the available throughput for each client pictorially in the topology diagram delivers a lot of value to network operators, as it helps them plan for throughput requirements. The throughput of a link can be depicted in a variety of ways, such as a color-coding scheme or varying the thickness of the link.
Only a few wireless networks support full-duplex connectivity. It is therefore necessary for the EMS to report link health in both directions (upstream and downstream) so that the network administrator can monitor the link easily.
Since many wireless networks allow dynamic addition of new Access Points, the EMS should automatically discover new Access Points and be able to provision them as well (using a predefined policy). This is a must for any EMS that supports a dynamic network.
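
As a rough illustration of the color-coding and link-thickness idea, the sketch below maps measured throughput to a drawing style, separately for each direction. The function names, the 33% and 66% thresholds, and the 1 to 5 pixel width range are all assumptions chosen for the example.

```python
from typing import Dict, Tuple


def link_style(throughput_mbps: float, capacity_mbps: float) -> Tuple[str, int]:
    """Return (color, line width in pixels) for one direction of a link."""
    if capacity_mbps <= 0:
        raise ValueError("capacity_mbps must be positive")
    # Fraction of nominal capacity actually achieved, clamped to [0, 1].
    ratio = max(0.0, min(throughput_mbps / capacity_mbps, 1.0))
    if ratio >= 0.66:      # assumed threshold for a healthy link
        color = "green"
    elif ratio >= 0.33:    # assumed threshold for a degraded link
        color = "yellow"
    else:
        color = "red"
    width = 1 + round(4 * ratio)  # thicker line means more throughput
    return color, width


def duplex_link_style(
    up_mbps: float, down_mbps: float, capacity_mbps: float
) -> Dict[str, Tuple[str, int]]:
    """Style upstream and downstream separately, since they can differ."""
    return {
        "upstream": link_style(up_mbps, capacity_mbps),
        "downstream": link_style(down_mbps, capacity_mbps),
    }
```

A topology view could call duplex_link_style for each link and draw two parallel lines, so an asymmetric problem (for example, a weak upstream) is visible at a glance.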
What are the challenges to keep in mind while developing an EMS to manage a device using the TL1 protocol?
Managing a device via TL1 is quite different from managing it via SNMP, and it offers its own benefits as well as challenges.
• In simple terms, TL1 messages are structured text messages (with some standard delimiters) that humans can easily read. TL1 gives equipment vendors the flexibility to define their own command sets, provided they follow the basic TL1 guidelines. This flexibility makes decoding TL1 messages quite cumbersome, since it involves parsing a custom command set defined for each device. The EMS should therefore have efficient text-processing algorithms so that messages are processed quickly. This parsing overhead can limit the processing capacity of the EMS, which in turn directly affects the scalability of the EMS solution.
• TL1 protocol works over TCP, which mandates a permanent socket connection to the device in order to receive autonomous messages from it. (In SNMP, Traps are sent to port 162 on the manager, so it is not necessary to maintain a connection to receive them.)
• The key issue in maintaining TCP connections to all the devices is the management of these sessions. This involves managing all the TCP socket connections and the associated threads that are required to process the TL1 messages received via these sockets without dropping any of them. Each TCP connection requires a thread, and although in theory there is no strict limit on the number of threads an operating system can support, each thread increases the memory allocated to the process. This limits scalability, as more devices mean more threads and, quickly, an out-of-memory situation. Most operating systems also have hard limits on the number of open sockets per process. In practice, a single EMS server instance may not be able to handle more than 300 to 400 TL1 sessions.
• The TL1 standard supports TL1 Gateways, whereby a network element (NE) can act as a gateway for a set of NEs. An EMS can use TL1 Gateway support to scale better, by maintaining a session with only one gateway to manage multiple devices. In general, EMS solutions that support TL1 Gateways natively scale more easily than those that do not.
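
To make the text-processing point concrete, here is a minimal sketch of splitting a command-style TL1 message into its blocks. It assumes the commonly used layout of a command code followed by colon-separated blocks (target identifier, access identifier, correlation tag, general block, parameters) and a terminating semicolon. It ignores quoting and vendor-specific extensions, and it does not handle responses or autonomous messages, which have a different layout. The TL1Command type, the function name and the parameter values in the usage line are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TL1Command:
    verb: str             # e.g. "RTRV"
    modifiers: List[str]  # e.g. ["ALM", "ALL"]
    tid: str              # target identifier (the NE being addressed)
    aid: str              # access identifier (entity within the NE)
    ctag: str             # correlation tag chosen by the sender
    params: List[str]     # comma-separated parameters from the payload blocks


def parse_tl1_command(text: str) -> TL1Command:
    """Parse a simplified command-style TL1 message (no quoting support)."""
    text = text.strip()
    if not text.endswith(";"):
        raise ValueError("message must be terminated by ';'")
    blocks = text[:-1].split(":")
    if len(blocks) < 4:
        raise ValueError("expected at least code, TID, AID and CTAG blocks")
    code_parts = blocks[0].split("-")
    # blocks[4], when present, is the general block; payload blocks follow it.
    params = [p for block in blocks[5:] for p in block.split(",") if p]
    return TL1Command(
        verb=code_parts[0],
        modifiers=code_parts[1:],
        tid=blocks[1],
        aid=blocks[2],
        ctag=blocks[3],
        params=params,
    )
```

For example, parse_tl1_command("RTRV-ALM-ALL:NODE1:SLOT-1:42::MJ,SA;") yields the verb RTRV, the modifiers ALM and ALL, and the parameters MJ and SA. A real EMS would need a per-device grammar on top of this, which is where the parsing cost discussed above comes from.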
What are the common approaches used to build an Element Management System?
Once Equipment Vendors decide to develop an EMS for their device, they want it developed as quickly as possible, since most often they start thinking about an EMS when they have to respond to customer RFPs or when they are in "trials" with their customers. Naturally, the burning question on every Equipment Vendor's mind is how to get the EMS developed quickly and cost-effectively. Three options are available for developing an EMS: (i) custom development of the EMS, (ii) using frameworks to develop the EMS, and (iii) using components to develop the EMS.
In the first approach, a team of 10 people develops an EMS from scratch, and the development effort typically lasts about a year [to develop the FCAPS (Fault, Configuration, Accounting, Performance and Security) and Topology modules]. The development can be done in-house by the Equipment Vendor or outsourced to another company. The advantage of this approach is that, since the application is built from scratch, it can be highly customized and made to represent the device very well. The drawbacks are obviously cost and time.
The second approach is to buy a framework that is readily available in the market and customize it to meet the Equipment Vendor's needs. Such frameworks typically have a preset architecture that cannot be altered. They provide APIs with which an additional application layer can be developed to customize the framework to individual requirements. The benefit of this approach is that the EMS can be developed quickly. The drawback is that the ability to build a custom application is greatly sacrificed for speed of development. Also, since the framework is rigid, most EMSs developed using this approach tend to have the same look and feel.
Finally, a key drawback of this approach is that additional functionality cannot be added easily.
The third approach is the component-based approach. In this case the EMS is architected from scratch but developed using pre-built components. For example, if a fault module needs to be developed, the architecture of the fault module is first defined based on the Equipment Vendor's requirements, and the module is then developed quickly using pre-built fault components such as log management and alarm correlation. The advantage of this approach is that it combines the benefits of the custom and framework approaches. Since the architecture is not pre-defined, it can be designed to suit individual requirements. At the same time, since development uses pre-built components, the development time is much shorter than for custom development. Another key factor is that additional functionality can be added easily with this approach.
Below is a graph that compares the three approaches to EMS development.
As you can see, with the custom approach no pre-built functionality exists when development starts. Functionality grows linearly with time, and the target functionality is achieved at time T1. With the component-based approach, since the work is done using pre-built components, some functionality is available at the start of development; functionality again grows linearly, and the same target is achieved at time T2, with T2 less than T1. With the framework approach, pre-built functionality is already integrated into the framework, so it offers more functionality at the start. However, since the framework is rigid, more functionality cannot be added easily; hence the curve flattens, and it takes longer to achieve the target functionality.
If we make the reasonable assumption that project cost is directly proportional to the time it takes to deliver the project (everything else remaining the same), it is easy to see that the component-based approach, in addition to being flexible, delivers the EMS application faster and more cost-effectively.
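
The linear part of this comparison can be worked through with a small calculation. The sketch assumes that functionality grows as prebuilt + rate × time and, as a simplification not stated in the text, that the custom and component-based approaches add functionality at the same rate. The function name and the numbers are illustrative only, and the flattening framework curve is not modeled.

```python
def time_to_target(target: float, prebuilt: float, rate: float) -> float:
    """Time needed to reach `target` functionality under linear growth.

    target   -- functionality required (arbitrary units)
    prebuilt -- functionality already available when development starts
    rate     -- functionality added per unit of time (must be positive)
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    return max(target - prebuilt, 0.0) / rate


# Illustrative numbers: 100 units of target functionality, 10 units per month.
t1 = time_to_target(target=100, prebuilt=0, rate=10)   # custom approach
t2 = time_to_target(target=100, prebuilt=30, rate=10)  # component-based approach
```

With these numbers t1 is 10 months and t2 is 7 months; under the cost-proportional-to-time assumption, the component-based project would cost about 70% of the custom one.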
What is the optimum time to start EMS development?
We are glad that you have chosen to have an element management system application for your device... it warms our heart! The time to get started on an EMS depends on several factors. We have listed the important ones here, but there are many more to consider depending on your particular situation: (i) device maturity, (ii) customer/marketplace requirements, (iii) financial situation, and (iv) skill set availability and prioritization.
As you will have noticed, three of the four key factors (device maturity, financial situation, skill set availability) are internal to the company, while customer/marketplace requirements are external and depend to a very great extent on the marketplace in which you participate. If you want to be a truly market-driven company, you will treat item (ii) as the most important factor; we discuss it last.
As a first step, let us look at the internal factors, over which the company presumably has some control. Based on our experience, we generally recommend that customers start their EMS development when the device development is almost complete and the device is stable. This allows the MIB definition of the device to be complete and the MIBs to be stable, a key requirement for a successful and stable EMS.
The second key factor is the current financial situation. Since most of the money during the device development phase is earmarked for developing the device (and rightfully so), there may not be enough funds to allocate to the development of an EMS. This would mean that the development of a full-fledged EMS might have to be postponed until additional funds are obtained. In the meantime, the device has to make do with a CLI or craft interface.
The third key factor is skill availability and prioritization. Since the EMS has a different set of drivers from your device, the skills required for EMS development are different from those required for your device. If you do the EMS development in-house, the EMS team needs to be staffed and managed. There is a lead time involved in putting together a good EMS team, which should be taken into account. Managing another team is also a load on the engineering management team that needs to be prioritized.
Lastly, the most important of all the factors is the customer/marketplace requirement. From our experience, some markets (especially larger service providers and carriers) and some products (high-end devices and devices that need to be deployed in volume) require an EMS when the device is introduced to the market. Since a full-fledged EMS development takes about 9 to 12 months, it is advisable to understand your market requirements and start the EMS development about a year before you enter the market in a big way.
What are the different approaches to adding functionality to an EMS?
There are primarily two ways to get your EMS in place. The first approach is to get all the functional modules, such as Fault, Configuration, Accounting, Performance, Security and Topology, developed in one shot. The second approach is to get the base functionality completed and then add more functionality in phases. In the second approach, the functionality most requested by customers is put in the first phase (say, Topology and Fault). In addition, the modules in phase 1 may not have all their features (i.e. the depth).
As you might have readily guessed, the first approach optimizes primarily for cost and time, whereas the second approach optimizes for cash flow. Getting the EMS developed in one stretch would definitely lower the cost of EMS development and reduce the time taken to develop it, whereas the second approach is easy on the cash flow. For most startup companies, since cash flow is very important (I've heard that "cash flow is the mother's milk of a start up!"), it is probably better to do the EMS development incrementally.
If we assume that cash flow is not a problem, the next step is to take into account other factors, such as device maturity and market/customer requirements, before deciding on the approach. If the device is well developed and has been stable, the first approach could be considered. Also, if the market you are addressing insists that you have a complete EMS (for example, ILECs and large Service Providers), the first approach makes sense. However, if your device is not fully mature and you are targeting smaller customers who are not very demanding, it may make sense to go with the second option, get a basic version of the EMS in place and start trials with customers (the trials will also surface many more requirements, giving you the opportunity to add the features you think are appropriate to your base EMS).
It is important to note that if you choose the incremental approach, you need to keep an eye on your long-term needs and ensure that your choice of architecture/technology supports the incremental approach.