As many are aware, there was a major error in this week’s LeTigre Release Channel deploy. Apparently, the root cause of the problem lay in the server-side prim account code, which Simon Linden describes as having “blown up” on the LeTigre RC channel. This resulted in a large number of items (including partial builds) being returned to people’s inventories as a result of regions being seen as “full”. The problem required a two-stage recovery:
- LeTigre regions were rolled back to a state prior to the faulty deployment, and were then updated with the BlueSteel code also deployed on Wednesday October 3rd. This helped to determine the extent of the damage (a total of some 1200 regions)
- The regions damaged by the land impact miscalculation were then restore to a state prior to the roll-out of the original faulty LeTigre code. These had to be restored manually, which took a considerable time
There is further post-mortem work going to to try and discover why this error did not reveal itself when the code deployed to LeTigre was being tested on Aditi, and whether there is anything specific to the regions impacted by the error which may have triggered it. Thought is now also being given to managing large scale region restorations, despite this being the first time there has been such a massive issue of this kind occurring across the grid.
Current RC plans for next week call for the same maintenance release to be made to all three RC channels, which Simon Linden describes as, “Mostly internal changes but [which] does include a minor update for the physics engine library … It’s almost all updating libraries … we’ve been using a fairly old set of compilers and such to make some of the development builds of the servers, and this brings us to more recent code.” Further details on the deploy should be available next week in the Second Life Server section of the Technology forum.
As indicated in part one of this report earlier this week, problems have continued with the Beta viewer code and high crash rates. Work has been ongoing to try and locate the probable cause(s), some of which included the temporary return of tcmalloc. While not actually a cause of the crash issues, having tcmalloc disabled was affecting efforts to reproduce the problems. a beta release was made on the 3rd/4th October (220.127.116.115434), which is proving to be a lot more stable than previous versions, and which happens to have tcmalloc enabled.
The current plan is for a further beta release to be made, most likely on Monday 8th October, which should see tcmalloc turned off once more (if not removed). Should this also prove to be stable, the fixes it contains will be merged back into the development viewer code, and this will clear the way for clearing the backlog of code merges for both the beta and development viewers. It may also see a further 3.4.1 release version of the viewer being made.
Among the projects awaiting merging into the development and beta viewer code are:
- The Steam support changes, which have been available within a development viewer stream, and which are described as “mostly cosmetic”. There is apparently a version of the viewer on Steam, but it is not available for general viewing / download, and is presumably there for testing purposes
- Monty Linden’s HTTP library (texture fetch) code
- Baker Linden’s Group Services project code
- Apple OSX 10.8 Mountain Lion support work, including gatekeeper compatibility
- Bug fixes and further regionalisation work.
Previous plans for these releases called for them to be made under the 3.4.2 code base. While this wasn’t discussed at the TPV/Dev meeting, one assumes this is still the case. However, speaking at the TPV Dev meeting on Friday October 5th, Oz Linden indicated that the order, etc., in which waiting merges will be cleared hasn’t been fully defined, and will be the subject of internal conversations next week at the Lab.
Avatar Baking Project
There is still no major news on this project, although work is continuing both on the viewer and on the server code.
The plan remains to provide TPV developers with access to the viewer code at least 8 weeks ahead of any initial deployment of the server-side code to an Agni release channel. This is to allow TPVs time to merge the code into their viewers and participate in ongoing testing of the new service.There is a possibility that that viewer code will be available sufficiently well ahead of things in order for TPVs to be able to use it alongside the testing on Aditi (beta grid), depending on the status of the beta grid tests and how development of the viewer code progresses.
Please use the page numbers below left to continue reading this report