August 7, 2005

Just wondering what measures have been taken to ensure that what happened a couple of weeks ago won't happen again?

Damian stated there was a backup server that was not installed yet. Is it installed now? Just making sure our products don't become useless once again.

Really eager to hear an update on this.

Thanks!
August 8, 2005

Hi Paul,

For security reasons I will not go into too much detail, except to say that we are working on things. Feel free to write Damian at [email protected]. I am not sure what level of detail he wants to be made public. I hope everyone understands!

Hope this helps,
Jim
ActiveSky Sales and Support
http://www.hifisim.com/images/asv_dev_team.jpg http://www.hifisim.com/images/asv_proud_supporter.jpg
August 8, 2005

Jim, thanks for the reply. I don't think it would do everyone any good if I wrote Damian myself. I posted here because it's information we ALL should know.

After all, when you buy the software and the servers go down, the software is pretty much useless for online flying.

What security reasons?
August 8, 2005

Hi,

Should the sources of the problem also know what is being done? With security issues, the whole community should not know the details. Yes, users have the right to know. But I don't feel that non-users should also know. Maybe Damian thinks differently, hence the suggestion to write him.

Now, what made the last situation more difficult was the fact that Damian was dealing with a real-life situation at the same time. We will have a double server backup in place. We will have others that can reset the server. We will work on a better method of communication between team members in times of difficulty. Those are some of the things we learned from the past, and we hope to correct them!

Hope this helps,
Jim
ActiveSky Sales and Support
http://www.hifisim.com/images/asv_dev_team.jpg http://www.hifisim.com/images/asv_proud_supporter.jpg
August 8, 2005

> What Security reasons?

If the problems had anything to do with denial of service attacks from the web, then keeping configuration details from the public is a prudent and common practice. Do you advertise the PIN code for your ATM card?

As far as I'm concerned, there is no need to worry about new failures that haven't occurred. Jim said they've taken measures to address uptime, and the evidence supports him. That's good enough for me, at least until it happens again. Then I would expect additional measures to be taken, which I'm sure they would be.

Just my $.02 as just one of the many happy owners of ASV.
August 9, 2005 Commercial Member

Firstly, I appreciate that things are being done to prevent a repeat of the downtime when the server died. At the time I posted a number of comments based on my own experience running high-availability commercial servers. Having seen this thread, I thought I should point out some misconceptions raised in it and again highlight some concerns.

Having high availability and reliability of servers is not directly related to having backup servers. It is all about reducing single points of failure. It does not matter how many backup servers you have if they are all connected to the same power source, or all connect to a single network infrastructure, or use a single Internet connection, or a single Internet provider, or can only be reset by a single person, etc. There are obviously costs associated with finding ways to reduce these single points of failure, and at some point you balance that cost against the risk, likelihood and severity of not taking each extra step.

How the AS developers and server maintainers decide to operate the servers internally is for them to decide based on this balance. They do seem to show concern about addressing hardware availability and providing a high level of service to us users of the product. If they decide not to publicise some of the internal configuration, that again is their choice, but always remember that "security by obscurity" usually translates to only limited security. Also, at some point there is an IP address for the server(s) used, which anyone with even limited knowledge and access to simple tools can discover, so if there are people seriously intent on disrupting service, not saying anything here isn't much of a hurdle for them to get over.

So, getting back to the single point of failure issue, what is of more concern to me is the inherent one in the software itself, which I think people should be more worried about.
This failure point is that provision of up-to-date information relies totally on being able to communicate with the AS servers.

If the AS servers are down for whatever reason (hardware, software, power, personnel, company going out of business, etc.), we have no method to utilise an alternate source of online weather information, and this functionality in the product stops working.

The VATSIM choice does not talk directly to VATSIM but via the AS servers. You cannot point it at other real-world weather providers such as NOAA or ADDS. There are no toolkits or APIs to allow others to easily develop alternate servers (notwithstanding the fact that AS itself provides no user configuration options to point it at alternate servers).

I would therefore like to see similar efforts in improving overall availability of online weather information, not just in provision and monitoring of backup servers within the AS team themselves, but also in the area of use of alternate data sources within the product.
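
To put rough numbers on the single-point-of-failure argument above, here is a minimal sketch. It assumes failures are independent and uses made-up availability figures (0.99 for a shared component such as one power feed or one Internet connection, 0.95 for each server); none of these values come from the AS setup, and the function names are purely illustrative.

```python
def parallel_availability(server_availability: float, n_servers: int) -> float:
    """Probability that at least one of n independent redundant servers is up."""
    return 1.0 - (1.0 - server_availability) ** n_servers


def system_availability(shared_availability: float,
                        server_availability: float,
                        n_servers: int) -> float:
    """Availability of n redundant servers that all depend on one shared component.

    The shared component (hypothetical: a single power source or network link)
    sits in series with the server pool, so its availability caps the total.
    """
    return shared_availability * parallel_availability(server_availability, n_servers)


if __name__ == "__main__":
    # Illustrative figures only: shared component 0.99, each server 0.95.
    for n in (1, 2, 3, 10):
        print(n, "server(s):", round(system_availability(0.99, 0.95, n), 6))
```

However many servers are added, the result under these assumptions never exceeds the availability of the shared component (0.99 in this example), which is why extra backup servers alone do not remove a single point of failure.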
August 10, 2005 Commercial Member

Hi,

Thanks for the feedback.

The previous downtime was, as mentioned, due to unconventional attacks on both our server systems, one of which was in the process of being activated for redundant client use in case of primary network failure. Since I was pulled away from the "activation" for family medical reasons and was unavailable, the system was unfortunately disabled for about 3 days before I could return and clean things up. No other HiFi member could fix things due to the state of the system and the complexity of the attack.

Without going into too much detail, we've spent LOTS of time addressing this issue to prevent problems in the future. The server access routines have been updated. The secondary redundant client access server is now online. The client software has been updated to automatically roll over to alternate server(s) as necessary. We are just now beginning internal beta testing and hope to have this out to everyone very soon.

In several years, our server system has only seen this one major downtime in excess of 4 hours. Prior to that we had maybe 3-4 sparse problems due to network and hardware issues. Compare that to some of the public sources, such as NOAA themselves! We are proud of our uptime record and it will only get better...

Unfortunately, there is no alternate data source. The HiFi DataNet collection system gets data from over 12 different sources, both public and private. The aloft synthesis and translation of World Forecast Model Data alone requires downloading over 50MB every hour, analyzing it, processing it, and providing meaningful data the client can use. Weather in between actual reporting stations is synthesized based on over 100MB per hour of downstream data amongst the 12+ sources (more being added all the time).
In short, AS is built around the HiFi DataNet and cannot use alternate "public sources" without severely limiting the complexity and realism of the weather we are attempting to render.

In addition to the recent changes made, we have other things coming, including additional servers, further data system updates, and client access enhancements. This is an ongoing effort and we see no end for the next several years!

Damian Clark
HiFi Simulation Technologies
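
For readers curious what "automatically roll over to alternate server(s)" might look like in general terms, here is a minimal sketch of ordered client-side failover. It is not the ActiveSky client code, and the function names and any URLs passed to it are hypothetical. It assumes each server exposes the same data over plain HTTP and that a connection error or timeout means "try the next one".

```python
import urllib.request
from typing import Callable, Iterable


def http_get(url: str, timeout: float = 5.0) -> bytes:
    """Fetch a URL with the standard library; raises URLError/OSError on failure."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_with_failover(urls: Iterable[str],
                        fetch: Callable[[str], bytes] = http_get) -> bytes:
    """Try each server in order and return the first successful response.

    `urls` is a hypothetical ordered list: primary first, then backups.
    """
    errors = []
    for url in urls:
        try:
            return fetch(url)
        except OSError as exc:
            # urllib's URLError and socket timeouts are OSError subclasses.
            errors.append(f"{url}: {exc}")
    # Every server failed; report what went wrong with each one.
    raise ConnectionError("all servers failed: " + "; ".join(errors))
```

A caller would pass the primary address followed by the backups, for example `fetch_with_failover(["http://primary.example/wx", "http://backup.example/wx"])`. Note that this only helps when the backups do not share the primary's point of failure.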