Monday, March 30, 2009

UCS, Clouds and HPC

How many buzz terms can I squeeze into a blog title? I wanted to add a few more, like ARRA, but enough is enough :)

Cisco positions UCS as a complete data center infrastructure and says that clouds are driving data centers toward it. However, UCS is not really a complete data center infrastructure, because it leaves out most of the facility components; it is designed to be dropped into existing data centers. I don't think that's where clouds are going. In fact, I think they are going to the companies that vertically integrate the furthest, all the way down to the power generation facility.

So, Cisco is trying to sell a radically new IT architecture into established IT shops (those that have data centers fully or partially populated now and are looking for incremental expansion or replacement).

Given the pre-recession resource realities (oil at $150/barrel and heading higher), the incremental improvements in operating expense might have been enough to sway these IT shops. But that world is gone, at least for a while; it will come back.

So cloud vendors might use UCS, but probably few will, because to survive in the cloud market you need to innovate from top to bottom (think power generation, think building construction) and drive costs as low as they can go.

Clouds and HPC have been the subject of a certain amount of academic debate. Most of the naysayers are those who want the last ounce of performance, whatever the cost. As you can tell from the previous paragraph, that's not in the clouds... (pun intended). What is in the clouds is this: researchers need to learn how to use cloud resources effectively. In one sense, they are already doing that with grids, like the TeraGrid. But in the large-scale roll-out world, I continually run into researchers and small research teams who follow the 'build it yourself' model of HPC, or its close relative: hire an integrator to build it for you. This is the 100-to-1,000-core market, and while some of it needs the last ounce of performance, the fact is that many of these users can't extract the maximum performance from what they already have. Parallel programming for HPC is just too hard or too obscure. And here is where both the problem and the opportunity for cloud vendors lie: showing these thousands of small research teams how to use cloud resources effectively. How to permanently move large data sets into the cloud infrastructure and thus avoid the nasty performance issues (not to mention billing issues) of multi-terabyte data set access. How to use functional programming models such as MapReduce to solve their algorithmic problems.
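
To make that last point concrete, here is a minimal sketch of the MapReduce style in plain Python: a toy word count over a made-up data set. A real cloud job would run the map and reduce steps in parallel across many machines under a framework like Hadoop; the point here is only the functional shape of the computation.

    from itertools import groupby
    from operator import itemgetter

    # Toy input standing in for a large data set already living in the cloud.
    documents = ["the cloud is the future", "the grid came first"]

    # Map: emit a (word, 1) pair for every word in every document.
    pairs = [(word, 1) for doc in documents for word in doc.split()]

    # Shuffle: bring all pairs with the same key (word) together.
    pairs.sort(key=itemgetter(0))

    # Reduce: sum the counts for each word.
    counts = dict((word, sum(n for _, n in group))
                  for word, group in groupby(pairs, key=itemgetter(0)))

    print(counts)

Every step is a pure function over the data, which is exactly what lets a framework scatter the work across hundreds of cheap nodes without the researcher writing any explicit parallel code.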

If this sounds more like consulting, you are right: it is more like consulting and less like buying generic off-the-shelf x86 servers. And that is also a sign of the problem here, because the paradigm of buying generic, off-the-shelf, PC-inspired x86 servers is well cemented into the bulk of the research community that has not yet moved to HPC.

So far, my experience in trying to set up a software-focused HPC support team for bioinformatics has been met with apathy. Nearly everyone wants to talk hardware: which processor are you using? If I say it really doesn't matter, as long as you can get the job done in an acceptable time frame at an acceptable cost, they turn away and go back to browsing the hardware vendors' web sites. You just gotta have the latest processor. Not using Nehalem yet? Well, get with it, it's so much better.

I do think these things will change, and here is why: scientific research is as competitive an arena as building and selling products or services. If cloud computing can deliver the productivity gains that I believe it can for research computing, then those who adopt it will soon be out-competing those who don't (this assumes, of course, that at least some of the world-renowned researchers who get big grants sign on). It also means that, just as in IT, a small startup (think smaller school here) that is funded well enough to land a top scientist will probably come to the clouds first, because it is not hobbled by existing infrastructure, either data centers or older HPC systems.

The big guys (think large academic research universities) will also 'get it', but claim not to see the demand for cloud computing. I suppose that in the more fully decentralized universities, less-well-funded researchers might see the opportunity and use it to do research that at one time could only have been done at a few high-end places...

Hope springs eternal.

Friday, March 20, 2009

RDBMS, Scientific data and Open Source.

OK, I know I recently said I gave up on open source. I must confess that I meant that comment only in the context of US health care software, where powerful entities have too much vested in the current state of affairs. When a market is very large (and health care software in the US is an $80+ billion industry), it is going to get a lot of attention. But in research, especially academic research, things are different. Since the current focus of a lot of academic health care research is 'translational', or bench-to-bedside - i.e., technology transfer - one must wonder whether market forces will stand in the way. But in a blog post on the ACM's website, I found another possible outcome: what is needed for some of the largest scientific problems of our day simply will not be developed by the market!

It's been a long time since I let my ACM membership lapse; the press of getting things done with commercial systems made most of CACM seem irrelevant. In time, good ideas should make their way from the lab to the product, but in the blog entry below, Michael Stonebraker argues that this won't happen with scientific data management, and furthermore that the problem is too big for any one academic institution - hence an organized open-source project.

http://www.cacm.acm.org/blogs/blog-cacm/22489-dbmss-for-science-applications-a-possible-solution/fulltext

For those who don't know Stonebraker, he is one of the "academic to industry" pioneers in RDBMS land, having been behind the creation of Ingres, whose technology (by way of Sybase) later became Microsoft SQL Server.

Tuesday, March 17, 2009

Microsoft in the Clouds.

Interview with Dan Reed:
http://www.hpcwire.com/features/Twins-Separated-at-Birth-41173917.html?viewAll=y

The scoop from Microsoft:
http://research.microsoft.com/en-us/news/features/ccf-022409.aspx

I confess, I have not fully read, let alone digested, all of this.
But when Microsoft execs make statements like this:

"The specific aspect relative to HPC is that cloud services are game changers, just as commodity clusters were a decade ago and graphics accelerators have been recently. This is not the future, this is the present."

Now add in Azure, and just about anything written to .NET can be turned into software as a service.

Other than nagging doubts about security and robustness, which the big guys will just hammer away at with 'look how big we are, we can't be wrong' approaches, there really is the possibility that the PC as we know it has reached the end of its popular life, much like the Atari game console or the eight-track tape. They still work, but who cares?

Monday, March 16, 2009

eBay scams

Well, as if having to wait for my money to clear was not bad enough, I now have to wait for my buyer to be reassured by eBay that I am not a fraud before he will pay me. This could take up to 7 days...
So, here is what happened: I listed an item, and it sold. Shortly thereafter, the buyer received an e-mail from someone else claiming to be the seller. That e-mail looked like this (I have removed the buyer's identifying info):

"Hi buyer, My name is Ramsey Connor, I am the seller of the Item#xyz.

Unfortunately, immediately after the listing closed I got a Notification from eBay and I must warn you that my account is unavailable and you cannot pay for the item through Paypal as usual. I sincerely apologize for the inconvenience this has caused and I would be more than willing to help you to complete the transaction immediately. Now I am located in United Kingdom because I have to sign a contract with a company and I know the mainspring of the paypal problem was caused because my account was accessed from different locations, while travelling to UK. Anyway we have a solution, payment approved by eBay for this transaction is Western Union Money Transfer so you can send the payment in a few minutes.

To send a wire transfer you need my full name: RAMSEY CONNOR and country: United Kingdom. You will deduct the transfer fees from the total and you will send the remaining amount. If you want to send the payment with your credit card check the following link for more details:

https://wumt.westernunion.com/WUCOMWEB/staticMid.do?"

So, this URL is embedded and seems to actually go to westernunion.com, which means someone is actually linked up to a WU account. I can only assume they are setting up and tearing down accounts rapidly so they can keep running this scam, and WU will know who the account belonged to, at least briefly. Maybe the account itself is hijacked... But somewhere the money trail, via account numbers, can be followed, and there is a person on the other end.

Social networking sites all blur together

OK, so now that I have Blogger linked to Twitter, and I think I just linked Twitter to Facebook, I am hoping that the Twitter-like part of Facebook will actually be fed from Twitter now!

In researching how to link all these things up, I ran into another strategy besides the feeder strategy I am using now: sign up for a social networking 'mega' site. This is a web mashup, or something like one, that consumes all your other social networking sites. I don't know, but something about these mega sites reminds me of the days when your browser vendor or search engine vendor wanted to offer you a 'portal'. I finally had to turn all those dang portal pages off, as waiting for the weather in Kathmandu to load was plain stupid :)

So, anyway, back to the serial linking exercise. The link from Blogger to Twitter is done via a third party and uses OpenID. However, the link from Facebook to Twitter lives inside Facebook (they must have an SDK or something) and asks you for your Twitter password. That is a very bad thing. Facebook, Blogger and Twitter should all use OpenID themselves, and then we could all stop scattering our passwords around the internet with companies large and small, who might someday go out of business and find a going-away bonus for someone by selling a password database!

Tuesday, March 10, 2009

The little guys win again

Yeah, we finally had success joining the organization's multi-domain forest from a Linux server.  The standard Samba stuff would probably work as well, but we found a company called LikeWise (likewise.com) that specializes in connecting Unix-like OSes to Active Directory.  The core component has been open sourced, and I know the Samba people are looking to add many of its functions.  It can replace the stock Winbind and obsolete the idmapper, and it can eliminate our complex configuration of needing to talk to both AD and a separate LDAP server, or of extending the AD schema for Unix.
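
As a quick sanity check after a join like this, it helps to verify that domain users now resolve through the standard Unix NSS calls. Here is a small Python sketch; the domain and account names are invented, so substitute a real DOMAIN\user from your own forest:

    import pwd
    import grp

    # Hypothetical AD account for illustration only.
    account = "CORP\\jdoe"

    try:
        entry = pwd.getpwnam(account)
    except KeyError:
        print("%s does not resolve - the join or identity mapping is broken" % account)
    else:
        # If the join worked, the AD user looks like any local Unix account.
        print("uid=%d gid=%d home=%s shell=%s"
              % (entry.pw_uid, entry.pw_gid, entry.pw_dir, entry.pw_shell))
        print("primary group: %s" % grp.getgrgid(entry.pw_gid).gr_name)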

In other words, the LikeWise folks understand that many of us are stuck out in the leaves of the organization and really can't make changes to the core infrastructure.  We have had vendors ask us to change registry settings on the domain controllers - ha, fat chance of that happening without a full code review, a security analysis and Microsoft's blessing (that's what big organizations do to cover themselves).

So, there is still a lot of work left to do, but at least we can see light at the end of the Active Directory tunnel.  

Now for the bad news: we need to switch to Linux and abandon OpenSolaris and ZFS.  I won't be sorry to see OpenSolaris go, but ZFS is quite nice.  Well, there is one other development that I can't talk about yet, but a white knight might come riding in and take us out of build-it-yourself mode.

Tuesday, March 3, 2009

Interesting virtualization cost analysis

While I am not going to present the raw data and calculations (they are the private property of my employer), I am going to share the rough 'bottom line' from a customer perspective.

Bottom line: Physical server ~$4,000, equivalent virtual server ~$3,000.

(Update: We are continuing our memory analysis, and with the workloads we have - Java app servers, web servers, vendor applications that used to run one per server - we are seeing, with 100 active slices, average memory utilization of 12% of available real memory and a peak (over several months of data so far) of just under 25%. That allows us to at least double our estimate of the number of slices per host. Even leaving a 20% safety margin, our cost can be lowered to about $270/slice/year! That changes the VM comparison above from $3,000 to ~$2,200.)

So two things need to be kept in mind here:

The costs include just about everything that can be identified as devoted to the servers: machine room operating costs (these are actuals, since we get bills every month), labor and a 5-year equipment depreciation. The calculations are audited by a financial group outside my organization, and they insist on pretty good documentation of the costs.

We are also getting some discounts on the hardware and software, but not so much that a very large ISP couldn't get the same or perhaps do even better.

So, without further ado:

An 8 GB, quad-core, dual-socket server costs a little under $4,100/year to keep running.

A 640 MB, single-CPU virtual server 'slice' costs a little under $370/year.

Cutting the main server's memory down to about 2 GB, which is what most applications we have seen typically need, will not really change the costs very much; say it goes to $4,000/yr. Making the equivalent of this server out of 'slices' would take 3 slices to match the memory (which only gives us 3 CPUs), or 8 slices to match the CPUs (which gives us about 5 GB of memory). So let's use 8 as the number: 8 x $370 = $2,960.
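
For anyone who wants to fiddle with the arithmetic, here is the comparison as a few lines of Python, using both the original $370/slice figure and the $270 figure from the update above (these are the rounded numbers quoted in this post, not the audited internals):

    # Back-of-the-envelope model of the numbers above.
    PHYSICAL_COST = 4000      # $/yr, 2 GB dual-socket quad-core server
    SLICES_PER_EQUIV = 8      # slices needed to match that server's CPUs

    for label, slice_cost in (("original estimate", 370),
                              ("after memory update", 270)):
        virtual_cost = SLICES_PER_EQUIV * slice_cost
        savings = 100.0 * (PHYSICAL_COST - virtual_cost) / PHYSICAL_COST
        print("%s: 8 x $%d = $%d/yr vs $%d/yr physical (%.0f%% cheaper)"
              % (label, slice_cost, virtual_cost, PHYSICAL_COST, savings))

That works out to roughly a 26% saving at $370/slice and a 46% saving at $270/slice.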

So it's pretty easy to see why virtualization is a big win in operating expense (OPEX). But virtualization has much more going for it than that: labor resources stretch much further, and limited floor space is much better utilized.

SUN CIFS update

It's been about two weeks since I last posted. I have written several things up in my head, but the better part of caution has kept me from posting until now.

As you can find out by looking back through past postings, we have been unable to join our SUN CIFS server to our campus/enterprise Active Directory for going on six months now.

As a sanity check, we decided to attempt a join to our own test Active Directory. That worked just fine, so we can conclude that the problem lies in the enterprise configuration we are trying to use.

Let's recap the elements of this enterprise configuration, since I believe many others may face the same environment. The AD we are working with is designed to allow distributed management by various units within the organization. It does this with a hierarchical implementation whose building block, in general directory jargon, is called an OU (organizational unit). So our structure is basically this: O=Corporate_name ---- OU=departments. By comparison, our local AD has no OUs, and its hierarchy is just one level, the root: O=department.

So, SUN's CIFS server can join at the root level, but not at the OU level.
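
To see the difference in LDAP distinguished-name terms, here is a small sketch of where the machine account would land in each case (all names are invented; this shows the shape of the tree, not our real configuration):

    def machine_dn(host, domain_dn, ou=None):
        """Build the DN where a machine account would be created.
        By default AD drops it in the stock Computers container; a
        delegated setup puts it under an OU instead."""
        container = ("OU=%s" % ou) if ou else "CN=Computers"
        return "CN=%s,%s,%s" % (host, container, domain_dn)

    # Our local, single-level AD - this kind of join works:
    print(machine_dn("cifs01", "DC=department,DC=edu"))

    # The enterprise AD, where we are delegated an OU - this is where
    # the SUN CIFS join fails:
    print(machine_dn("cifs01", "DC=university,DC=edu", ou="OurDepartment"))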

I did take a look at the various threads on SUN's CIFS-discuss list. Besides our own trail of woe, I also found a similar one from Indiana University.

So, do we have any workarounds? Two possibilities spring to mind, and both involve taking our existing local AD and hooking it up to the enterprise AD. In one method, we join up as a 'forest' member - but enterprise policies won't allow that. The other method is to set up a one-way trust between our local AD and the enterprise AD. That is still under policy discussion, but it should be OK: since the trust runs from us to them and not the other way around, we presumably could not do anything 'bad' to compromise the enterprise...

Whether either of these workarounds will actually work is yet to be determined.