Author Archives: kp

Technical Procedure To Set Up the Virtual Supercomputer

I hope you have read the overview post, Setting Up a Virtual Supercomputer Using BOINC.

Now I will explain the technical details of setting up a supercomputer on an academic campus.

There are four main steps involved in this:

  • Setting up a BOINC server.
  • Creating a grid of trusted nodes.
  • Setting up the volunteer computing segment.
  • Integration and finalization.

    1) Setting up BOINC server:

    We need a server dedicated to managing the virtual supercomputer. A dual Intel Xeon or AMD Opteron machine is a good choice; a midrange server such as a Dell PowerEdge will do. The Internet connection should be reliable, and the server must have a static IP address. It should have at least 2 GB of RAM and 40 GB of free disk space, and ideally a UPS, a RAID disk configuration, hot-swappable spares and a temperature-controlled machine room. Secure it as thoroughly as you can: put the entire system behind a firewall and switch off services such as FTP and Telnet that are not in use.
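The minimums above can be written down as a small checklist. This is only a sketch: `ServerSpec` and `unmet_requirements` are hypothetical names made up for illustration, not part of BOINC, and the thresholds are simply the figures quoted in the paragraph.

```python
from dataclasses import dataclass

# Minimums quoted in the post: 2 GB RAM and 40 GB of free disk space.
MIN_RAM_GB = 2
MIN_FREE_DISK_GB = 40


@dataclass
class ServerSpec:
    """Hypothetical description of a candidate server (not a BOINC type)."""
    ram_gb: float
    free_disk_gb: float
    has_static_ip: bool
    behind_firewall: bool


def unmet_requirements(spec: ServerSpec) -> list[str]:
    """Return a list of problems; an empty list means the spec passes."""
    problems = []
    if spec.ram_gb < MIN_RAM_GB:
        problems.append(f"RAM is {spec.ram_gb} GB, need at least {MIN_RAM_GB} GB")
    if spec.free_disk_gb < MIN_FREE_DISK_GB:
        problems.append(
            f"free disk is {spec.free_disk_gb} GB, need at least {MIN_FREE_DISK_GB} GB"
        )
    if not spec.has_static_ip:
        problems.append("server has no static IP address")
    if not spec.behind_firewall:
        problems.append("server is not behind a firewall")
    return problems


if __name__ == "__main__":
    candidate = ServerSpec(ram_gb=4, free_disk_gb=120,
                           has_static_ip=True, behind_firewall=False)
    for problem in unmet_requirements(candidate):
        print("FIX:", problem)
```

Items such as the UPS, RAID and machine-room cooling are left out because they are recommendations rather than hard minimums.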

    Software requirements:

    • VMware Player
    • BOINC Server Virtual Machine

    VMware Player is a free virtualization product from VMware, Inc. (vmware.com). It runs virtual machines, i.e. it creates a virtual environment inside the host system. For example, you can run Windows inside Linux or vice versa, provided you have the appropriate virtual machine. You can download the BOINC server virtual machine from boinc.berkeley.edu. Download the BOINC VM (847 MB) and run it in VMware Player on the server to get started.

    Now that we have a server with the BOINC virtual machine running on it, it's time to move on to creating the grid.

    2) Creating grid of trusted nodes

    Although BOINC was originally designed for volunteer computing, it can be configured to work for grid computing.

    The steps in creating a BOINC-based grid are:

    • Modify the preferences of the work unit (the computation to be performed) on the BOINC server to disable redundant processing. Since a grid contains only trusted nodes, redundancy is not necessary.
    • Create an account with the general preferences enforced for the desktop grid. Clients can be remotely monitored and controlled if necessary.
    • Configure the project to disable account creation. New account creation is for the volunteer computing segment, and we do not require it here.
    • Create a custom installer that includes the desired configuration files.
    • Deploy the installer on each system in the lab and on other trusted computers.
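The first and third steps above amount to choosing a few settings per segment. The sketch below shows one way to think about them. The key names `min_quorum` and `target_nresults` are modeled on BOINC's work-unit parameters as I understand them, and `account_creation_enabled` is a hypothetical label, so treat all of them as assumptions and check the BOINC server documentation for the exact names in your version.

```python
def replication_settings(trusted: bool, quorum: int = 2) -> dict:
    """Choose how many copies of a work unit to send out.

    Trusted grid nodes get a single copy and a single result is accepted.
    Untrusted nodes get `quorum` copies, and that many results must agree.
    """
    if quorum < 1:
        raise ValueError("quorum must be at least 1")
    if trusted:
        return {"min_quorum": 1, "target_nresults": 1}
    return {"min_quorum": quorum, "target_nresults": quorum}


def segment_profile(trusted: bool, quorum: int = 2) -> dict:
    """Bundle the settings for one segment (hypothetical structure)."""
    profile = replication_settings(trusted, quorum)
    # Grid nodes share one pre-made account, so sign-up stays closed there.
    profile["account_creation_enabled"] = not trusted
    return profile


if __name__ == "__main__":
    print("grid:     ", segment_profile(trusted=True))
    print("volunteer:", segment_profile(trusted=False, quorum=3))
```

Keeping both segments in one function makes it obvious that the grid is just the volunteer setup with redundancy and sign-up switched off.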

    We have now set up each node in the grid segment. Note that our economical virtual campus supercomputing facility combines the benefits of both desktop grid computing and volunteer computing. We connect the trusted systems (such as lab machines) to the desktop grid and the other, non-trusted systems (student laptops and miscellaneous PCs) to the volunteer computing segment. Next we set up the volunteer computing segment.

    3) Creating the volunteer computing segment

    As BOINC is designed specifically for volunteer computing, the BOINC client needs little change.

    Following a similar procedure, set up another custom installer with:

    • Account creation enabled.
    • Redundancy set to a desired value.
    • Other preference parameters set to suit specific needs.

    Then ask students and faculty to install this custom client.
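To see why redundancy matters for untrusted machines, here is a minimal sketch of quorum checking: the same work unit goes to several volunteers, and a result is accepted only when enough of them agree. `canonical_result` is a hypothetical helper written for this post; it is not BOINC's validator, which compares results in an application-specific way.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def canonical_result(results: Sequence[Hashable],
                     min_quorum: int) -> Optional[Hashable]:
    """Return the result reported by at least `min_quorum` nodes, else None.

    If two different results both reach the quorum, the one seen first wins.
    """
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    return value if count >= min_quorum else None


if __name__ == "__main__":
    # Three laptops ran the same work unit; one returned a wrong answer.
    print(canonical_result([42, 42, 7], min_quorum=2))
    # No two results agree, so nothing is accepted yet.
    print(canonical_result([1, 2, 3], min_quorum=2))
```

A higher quorum gives more confidence in each answer but spends more of the pooled CPU time on repeated work, which is why the grid segment turns it off.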

    4) Integration and Finalization

    Connect the trusted systems to form the desktop grid, and keep the lab systems switched on whenever computing power is needed. Distribute the volunteer client to all non-trusted units in the VCSF (the virtual campus supercomputing facility), e.g. student laptops, and let them connect whenever they are powered on. The whole network is connected by wired or Wi-Fi LAN.

    THE CLIENT SIDE

    Volunteers who are ready to contribute to the project should be aware of how much CPU BOINC uses.

    The first picture is a screenshot of the CPU usage of my system before installing BOINC. The average CPU usage of a computer is typically below about 20% on Windows Vista and below 5% on Windows XP. Because BOINC uses the idle processor time for supercomputing tasks, this figure rises.

    The second shows the CPU usage graph after installing BOINC. You can see that CPU usage rises to 100%. I was contributing my CPU to the SETI@home project, the search for extraterrestrial intelligence.
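The jump to 100% is just the idle share being filled. Below is a deliberately simplified model of that arithmetic; `total_cpu_usage` and its `boinc_cap_pct` parameter are hypothetical names, and the real client's throttling preferences work differently in detail, so treat this as an illustration only.

```python
def total_cpu_usage(baseline_pct: float, boinc_cap_pct: float = 100.0) -> float:
    """Estimate overall CPU usage once BOINC fills the idle time.

    baseline_pct  -- CPU used by everything else on the machine
    boinc_cap_pct -- assumed upper limit on what BOINC may take
    """
    for value in (baseline_pct, boinc_cap_pct):
        if not 0 <= value <= 100:
            raise ValueError("percentages must be between 0 and 100")
    idle_pct = 100.0 - baseline_pct
    boinc_pct = min(idle_pct, boinc_cap_pct)  # BOINC only takes what is idle
    return baseline_pct + boinc_pct


if __name__ == "__main__":
    print(total_cpu_usage(20))      # Vista-like baseline, no cap
    print(total_cpu_usage(5, 50))   # XP-like baseline, BOINC capped at half
```

With no cap the total reaches 100% whatever the baseline, which matches the second screenshot; a volunteer worried about heat or fan noise would lower the cap.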

    Setting Up a Virtual Supercomputer Using BOINC

    Campuses have always been places of innovation. A supercomputing facility on campus can greatly aid the R&D associated with it. Students get exposure to the supercomputing arena, and they can contribute to indigenous projects.

    But what if we could set up such a facility using the computers already present on campus, with no extra investment? And what if the implementation did not create any bottleneck in the normal functioning of those computers? That is the project we were working on for quite a long time: a cost-effective way to set up a virtual supercomputing facility that uses the unused processing power of all the computers on campus.

    We derived the implementation idea from grid computing and volunteer computing. For readers who are not so tech-savvy: grid computing is a variant of distributed computing. Say someone has a very complex, resource-draining program and a dozen computers. He designs the program so that it can be divided into pieces, each running autonomously on one of the computers, and together giving the same solution as a single very powerful computer with the capabilities the program demands. Grid computing is therefore described as a form of distributed computing that is loosely coupled (the computers do not have to communicate with each other while solving the problem assigned to them), heterogeneous (the computers can be of different kinds and use different platforms) and geographically dispersed.
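The "dozen computers" idea can be tried on a single machine. In this sketch, worker threads stand in for the computers (an assumption made purely so the example runs anywhere), and the job is a sum of squares, chosen only because it splits cleanly into independent pieces. All function names are made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def split_range(start: int, stop: int, pieces: int) -> list[tuple[int, int]]:
    """Split [start, stop) into `pieces` contiguous, near-equal sub-ranges."""
    size, extra = divmod(stop - start, pieces)
    chunks = []
    lo = start
    for i in range(pieces):
        hi = lo + size + (1 if i < extra else 0)  # spread the remainder
        chunks.append((lo, hi))
        lo = hi
    return chunks


def sum_of_squares(chunk: tuple[int, int]) -> int:
    """The piece of work one 'computer' does, with no help from the others."""
    lo, hi = chunk
    return sum(n * n for n in range(lo, hi))


def distributed_sum_of_squares(start: int, stop: int, nodes: int = 12) -> int:
    """Hand one chunk to each node, then combine the partial answers."""
    chunks = split_range(start, stop, nodes)
    with ThreadPoolExecutor(max_workers=nodes) as pool:
        partials = list(pool.map(sum_of_squares, chunks))
    return sum(partials)


if __name__ == "__main__":
    total = distributed_sum_of_squares(0, 1_000_000)
    print(total == sum(n * n for n in range(1_000_000)))
```

Each chunk is computed without talking to any other chunk, which is the loosely coupled property described above; only the final `sum` needs all the pieces.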

    In volunteer computing, anyone inclined towards a greater human cause can donate part of their PC's computational power as a service. Many data-intensive projects, such as SETI@home, run on volunteer computing, with people all around the world participating.

    So on a campus we have the computers in the labs as well as those in the hands of students. Using an open-source middleware called BOINC, we can pool the unused processing power of all these computers. Here we use something like cycle stealing, where idle processor cycles are taken from the participating nodes to build the required virtual supercomputer.
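A rough way to picture what pooling buys is to add up the idle share of every node. This is a back-of-the-envelope sketch, not a measurement: `pooled_idle_capacity` is a hypothetical helper, and it assumes every machine is equally fast and always reachable, which a real campus will not satisfy.

```python
def pooled_idle_capacity(busy_percent_by_node: list[float]) -> float:
    """Return the pooled idle capacity in 'whole computer' equivalents.

    Each entry is the percentage of CPU a node already spends on its
    owner's work; the remainder is what cycle stealing could use.
    """
    for busy in busy_percent_by_node:
        if not 0 <= busy <= 100:
            raise ValueError("busy percentage must be between 0 and 100")
    return sum((100.0 - busy) / 100.0 for busy in busy_percent_by_node)


if __name__ == "__main__":
    # 200 nodes, each about 20% busy with its owner's work.
    campus = [20.0] * 200
    print(f"about {pooled_idle_capacity(campus):.0f} machines' worth of idle CPU")
```

For 200 nodes that are each about 20% busy, the estimate is roughly 160 machines' worth of otherwise wasted processing time.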

    The concept is in essence similar to a volunteer computing project, but the BOINC middleware has to be adapted to work in a smaller arena, with at most 200 or 300 nodes. For this we made some changes to the original BOINC, such as developing a hierarchical tree-searching technique and an IDS (Intrusion Detection System).

    I have tried to outline the basic concepts of this implementation in a not-so-techie manner. In the next post I will outline the technical procedure to set up the same on a campus, with more details on the changes we made to the BOINC software.

    And for the techie readers, who I am sure will be a bit disappointed after skimming through this: BOINC stands for Berkeley Open Infrastructure for Network Computing. It is an architecture developed by David Anderson to support grid-based projects. It is available as open source, thanks to those great minds. It is this middleware that integrates the various nodes in the virtual supercomputing facility, enabling them to interact with each other and manage multiple work modules.
