
Conference nyoss1::market_investing

Title:Market Investing
Moderator:2155::michaud
Created:Thu Jan 23 1992
Last Modified:Thu Jun 05 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:1060
Total number of notes:10477

70.0. "Kendall Square Research" by DECEAT::SHAH () Tue Feb 18 1992 21:00

    
    There was an article about Kendall Square Research in the Globe this
    weekend. They intend to go public and offer the stock in the $9-$11
    price range.
    
    Can someone provide more information than the Globe article? In general,
    how does one go about finding details about the *NEW* offerings?
    
    Thanks,
    /Alkesh
    +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
    The article -
    Kendall Square Research - Supercomputer maker plans to tap the stock
    market {The Boston Globe, 15-Feb-92, p. 23} 
    
    Kendall Square Research Corp., a supercomputer maker in Waltham that
    has raised $63 million in private capital since 1986 said yesterday it
    plans to tap the stock market for up to $33 million. The company hopes
    to go public with 3 million shares at an estimated price of $9 to $11
    each. The offering, which requires Securities and Exchange Commission
    approval, would leave about 30% of its outstanding shares in public
    hands. The company's SEC registration shows that William I. Koch, a
    Boston investor who recently relocated to San Diego to be closer to the
    development of a new high-technology sailboat for the America's Cup
    race, is its major investor. He owns nearly 58% of the company's
    shares, a stake that would be diluted to 40.5% after the initial public
    offering. Other major investors include the Palmer Organization
    Partnerships of Woburn, which would own nearly 6% of the shares after
    the deal; Olivetti Holdings NV, the venture investment arm of the
    Italian computer conglomerate; Sprout Funds, a venture firm; and John
    Hancock Venture Capital Fund. After Koch, Kendall's largest individual
    investor is Henry Burkhardt 3d, who cofounded the company and is its
    president and chief executive. He would own 3.7% of the stock after the
    offering. Burkhardt is best known as co-founder of Data General Corp.
    and Encore Computer. Another early investor is C. Gordon Bell. None of
    the initial investors plans to sell their shares in the offering. A
    Kendall spokeswoman said initial shipments of its first systems, known
    as the KSR1, were made to the Oak Ridge National Laboratory, Cornell
    University and Manchester University in England late last year. The
    company makes massively parallel supercomputers. The company has been
    involved in some controversy based on a recent article in Upside
    Magazine by journalist George Gilder of Tyringham. In praising the
    design of the KSR1, Gilder was critical of KSR's Bay State rival,
    Thinking Machines Corp. of Cambridge. Gilder said Thinking Machines'
    newest computer, the CM-5, employs a less effective massively parallel
    design. Thinking Machines founder Danny Hillis sharply disputes
    Gilder's claims, saying the author's conclusion is based on incorrect
    information. Kidder Peabody & Co. is lead underwriter for Kendall
    Square Research's stock offering.
                                          
    
70.1. "Some technical info on KSR" by TPSYS::ABBOTT (Robert Abbott) Mon Feb 24 1992 14:35
From USENET:

Article: 27192
Path: ryn.mro4.dec.com!nntpd.lkg.dec.com!news.crl.dec.com!deccrl!decwrl!uunet!ksr!dean
From: dean@ksr.com
Newsgroups: comp.arch
Subject: Announcing the KSR1 Supercomputer
Keywords: KSR Supercomputer
Message-ID: <10017@ksr.com>
Date: 22 Feb 92 00:14:16 GMT
Sender: news@ksr.com
Reply-To: ksr-info@ksr.com
Lines: 324
 
Our users, prospects and other interested parties have asked us to post
information on the net as a first step towards establishing a forum for 
sharing information on the KSR1.
 
In 1986 we took on the challenge of building a high performance system that
combined the price/performance advantages of multiple CMOS processors, a
traditional programming model that would allow users to develop, port and run
applications easily, and scalability.  This last includes scalability of
component technology and processor count within a given generation, and
technology scalability across future generations of systems.  The KSR1 is now
installed and running at customer sites.
 
We have not opened this forum until now because of a commitment to
under-promise and over-deliver.  So we won't be hyping the airwaves with "fast,
faster, fastest" rhetoric.  We'll just stick with facts.  When we deliver a
teraflop it will be usable, affordable, and part of a successive generation
of products.
 
If you don't find that to be a radical idea, we think you will read our
description of the KSR1 with interest and enthusiasm.  
 
Henry Burkhardt III
Chairman, President and Chief Executive Officer
Kendall Square Research
 
email: henry@ksr.com
 
 
 
KSR1 Computer System
 
The KSR1 is a highly parallel computer system  designed to be scalable to
thousands of processors while preserving the simplicity and familiarity of a
shared memory programming model.  Each processor is a RISC-style superscalar
64-bit unit operating at 20 MIPS and 40 MFLOPS (peak).  A KSR1 system contains
from eight to 1088 processors with a peak performance range from 320 to 43,520
MFLOPS, all sharing a common virtual address space of one million megabytes
(2**40 bytes).   
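
As a quick sanity check on that range (simple arithmetic, not from the
announcement itself), peak performance is just the processor count times the
40 MFLOPS per-processor peak:

    /* Back-of-the-envelope check of the quoted peak figures: each
     * KSR1 processor peaks at 40 MFLOPS, so system peak scales
     * linearly with processor count.
     */
    #include <stdio.h>

    int main(void)
    {
        int configs[] = { 8, 1088 };    /* smallest and largest systems */
        int i;

        for (i = 0; i < 2; i++)
            printf("%4d processors -> %6d MFLOPS peak\n",
                   configs[i], configs[i] * 40);   /* 320 and 43520 */
        return 0;
    }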
 
KSR1 Software
The KSR1 is the first general purpose computer with supercomputer performance
and workstation price/performance.  KSR expects its user community to be
performing all the classical scientific calculations, all the typical business
functions (e.g., transaction processing, decision support), and all the typical
Unix functions (e.g., document preparation, mail) at the same time on the same
machine.  The design of the system is such that each community will get
superb, cost-effective performance.
 
KSR OS is an extension of OSF/1.  As such, it is a very complete implementation
of all of Unix.  KSR OS is fully compatible with BSD 4.3, which has no official
validation suite.  In addition, it will pass the validation suites for ATT SVr3
base and kernel extensions, X/Open XPG3, and POSIX.  
 
KSR OS does not use any front end machines and there is no distinguished
processor.  Thus, there are no OS bottlenecks and no reason to limit the
traditional Unix flexibility.  In particular, KSR OS supports an arbitrarily
large number of multi-threaded processes timesharing a large number of
processors.  This ability to timeshare is crucial in many interactive
applications, in which periods of intense computing are followed by human time
scale periods of thought.  Interactive applications spanning the entire range
-- from state-of-the-art numeric processing guided by a user through
scientific visualization, all the way to traditional transaction processing
as practiced by banks and airlines -- are efficiently and naturally supported
in the KSR OS environment.
 
The KSR1 environment is what a sophisticated Unix user would expect to find. 
At the user interface level, there is X11 and Motif.  At the language level,
there is  Fortran, with automatic parallelization,  C (both the ANSI and PCC
dialects), and  IBM-compatible COBOL.  At the database level there is the
ORACLE relational database management system (RDBMS), including application
development tools.  Kendall Square Research is extending ORACLE's features for
the parallel environment.  At the transaction processing level, there is AT&T's
Tuxedo /T and Tuxedo /D, fast non-relational file access methods, and fourth
generation languages.
 
For decision support applications, ORACLE for KSR1 will provide automatic
parallel processing for complex queries.  Kendall Square Research has developed
a general purpose technique called Query Decomposition which automatically
parallelizes SQL queries generated by ORACLE-based applications.  Future third
party RDBMS software ported to the KSR1 will also take advantage of the Query
Decomposition tool.
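
The posting doesn't describe how Query Decomposition works internally; as a
rough illustration of the general idea only (my sketch, not KSR's code), a
query over a large table can be split into independent sub-queries, one per
processor, whose partial results are then merged:

    /* Illustrative sketch of query decomposition -- NOT KSR's actual
     * Query Decomposition tool.  A scan of nrows rows is split into
     * nprocs independent sub-scans; each could run on its own
     * processor, and the partial counts are merged at the end.
     */
    #include <stdio.h>

    static long scan_partition(long lo, long hi)  /* one sub-query */
    {
        long matches = 0, row;
        for (row = lo; row < hi; row++)
            if (row % 7 == 0)                     /* dummy predicate */
                matches++;
        return matches;
    }

    int main(void)
    {
        long nrows = 1000000, total = 0;
        int  nprocs = 32, p;
        long chunk = nrows / nprocs;

        for (p = 0; p < nprocs; p++)      /* in reality, in parallel */
            total += scan_partition(p * chunk,
                     p == nprocs - 1 ? nrows : (p + 1) * chunk);
        printf("matching rows: %ld\n", total);
        return 0;
    }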
 
Parallel Fortran programming on the KSR1 can be fully-automatic, semi-automatic
or manual.  The parallel programming environment of the KSR1 is based on a
proprietary parallel run-time system (PRESTO) that makes run-time decisions
based on compiler-generated or programmer-specified directives.  The
functioning of the runtime system is one of the keys to KSR's dramatically
improved performance.  The system decides the level of resources it will
devote to a particular parallel task at runtime, based on the amount of
calculation required and the resources available at that moment, rather than
making a static resource-allocation decision at compile time.  The result of
this policy is that real-world problems with significant variations in their
processing requirements can be run together, taking advantage of all the
cycles on the machine, rather than run one at a time, wasting cycles in those
parts of the program that don't exhibit maximum parallelism.
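
The posting doesn't show PRESTO's interface, but the policy it describes --
sizing each parallel task from the work and the processors free at that
moment, rather than at compile time -- has roughly this shape (purely a
sketch; idle_processors() and the logic here are invented for illustration):

    /* Sketch of the dynamic policy described above -- NOT the PRESTO
     * API.  The runtime picks a worker count for each parallel region
     * from the work size and the processors currently idle.
     */
    extern int idle_processors(void);   /* hypothetical runtime query */

    int workers_for(long iterations, long min_chunk)
    {
        int by_work = (int)(iterations / min_chunk); /* enough work?  */
        int by_load = idle_processors();             /* what's free?  */
        int n = by_work < by_load ? by_work : by_load;
        return n > 1 ? n : 1;       /* too little of either: run serial */
    }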
 
While offering a highly parallel applications development environment, Kendall
Square Research will make available in 1992 scientific and mathematical
subroutine libraries, and important third-party software packages for
computational fluid dynamics, quantum chemistry, mathematical algorithms for
engineering applications, molecular dynamic modeling for computational
chemistry, and finite element analysis for engineering applications.
 
KSR1 Networking
 
KSR1 supports an extensive set of connectivity technology including:
 
-  TCP/IP, NFS, DCE, SNA-3270, 3770, LU6.2/PU2.1, ISO/OSI X.25, X.29, X.28, X.3
protocols;
 
-  Ethernet, Token Ring, HiPPI, and FDDI transports; and
 
-  Industry standard buses, the first of which is VME, to facilitate the
integration of third-party communication products.
 
 
ALLCACHE Memory
 
The KSR1's shared memory programming model is made possible by a new
architectural technique called ALLCACHE memory.  The KSR1 memory system is
designed to do for distributed memory what virtual memory did for hierarchical
memory -- it replaces the complexity and rigidity of the physical mechanism
with a uniform address space, now shared by a set of processors. System
hardware and software maps this space into physical devices. The KSR1 ALLCACHE
memory system achieves this programming simplicity without sacrificing the
benefit of distributed memory -- scalability -- its performance continues to
be good even as the number of processors grows very large.

The memory models of today's highly parallel computer architectures raise
problems for programmers which are reminiscent of storage management in the
1960s.

Twenty-five years ago, storage management via overlay structures was an
integral part of the job of writing a program. Necessarily, programmers
attacked the task with a static analysis of the memory requirements of a single
program. Advances in programming practice and system architectures, however,
gradually rendered static storage management infeasible. The goals of machine
independence, re-use of modular program elements, and algorithms of high
complexity characterized by data structures of widely varying size and shape
were inconsistent with static, programmer controlled storage management. In
addition, the introduction of system environments in which computers were
organized for simultaneous use by several programs made it impossible for the
author of a single program to predict accurately the time-varying storage
requirements of the entire system.

Ultimately, these factors led to the adoption of virtual memory as a
near-universal feature of storage management in modern computer architectures.
Virtual memory makes storage management dynamic and largely automatic. It
permits programmers to write applications with a storage abstraction which is
simple and powerful -- a single uniform address space. System hardware and
software maps this space into physical devices.

Highly parallel computer architectures reprise these early storage management
issues with a new twist. All of the highly parallel systems that have been
introduced have distributed memories. That is, the physical memory comprises a
set of memory units, each connected to a unique processor. The processor-memory
pairs are interconnected by a network. Distributed memories have been universal
among highly parallel machines because they provide the only known means of
providing completely scalable access to memory -- that is, access whose
bandwidth increases in direct proportion to the number of processors.
In most of today's parallel systems, the job of managing the movement of codes
and data among these distributed memory units belongs to the programmer. The
job is similar in style to the task of managing the migration of data back and
forth between primary and secondary storage prior to the introduction of
virtual memory, but it is much more complex. As before, programmers need to be
concerned about exactly what will fit where and what to remove to make room for
something new. Now, however, there are thousands of memory units to deal with
instead of just two or three.

ALLCACHE memory provides programmers with a uniform 2**40 byte (million
megabyte) address space for instructions and data. This space is called system
virtual address space or SVA space. The contents of SVA locations are
physically stored in a distributed fashion. ALLCACHE memory physically
comprises a set of memory arrays called local caches, each capable of storing
32MB. There is one local cache for each processor in the system. Hardware
mechanisms (the search engine described later) cause SVA addresses and their
contents to materialize in the local cache of a processor when the address is
referenced by that processor. The address and data remain at that local cache
until the space is required for something else.

As the name suggests, ALLCACHE memory behavior is like that of familiar caches:
data moves to the point of reference on demand. However, unlike the typical
cache architecture (which we might call SOMECACHE memory), the source for the
data which materializes in a local cache is not main memory but rather another
local cache. In fact, all of the memory in the machine consists of large,
communicating, local caches; the main memory of the machine is identical to the
collection of local caches.

The address and data that materialize in local cache A in response to a
reference by processor A may continue to reside simultaneously in other local
caches. Consistency is maintained by distinguishing the type of reference made
by processor A: a) If the data in the location will be modified by A, the local
cache will receive the one and only instance of an address and its data. b) If
the data will be read but not modified by A, the local cache will receive a
copy of the address and its data.

When processor A first references the address X, the ALLCACHE memory searches
that processor's local cache to see if the requested location is already stored
there. If not, a hardware search engine locates another local cache (say, local
cache B) where the address and data exist.
If the processor request being serviced is a read request (for example, to load
the value into a register) then the search engine will copy the address and
data from local cache B into local cache A. The amount of data copied will be
128 bytes, called a sub-page. At the end of this operation the sub-page will
reside at both A and B. If the processor request is a write request (for
example, to store the contents of a register into this location) then the
search engine will remove the copy of the sub-page from local cache B as well
as from any other local caches where it may exist before copying it into local
cache A. Thus the search engine is responsible for finding and copying
sub-pages stored in local caches and for maintaining consistency by eliminating
old copies when new contents are stored.
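
For readers who prefer the protocol in operational form, here is a toy model
of the read and write cases just described (an illustration of the stated
behavior, not KSR hardware or software; Invalid and Copy are the posting's
state names, while EXCLUSIVE here stands in for the "one and only instance"
case):

    /* Toy model of the ALLCACHE read/write behavior described above. */
    #include <string.h>

    #define NCACHES   8     /* local caches (one per processor) here   */
    #define NSUBPAGES 64    /* sub-pages tracked per local cache       */
    #define SUBPAGE   128   /* bytes moved per request, as in the KSR1 */

    enum substate { INVALID, COPY, EXCLUSIVE };

    struct local_cache {
        enum substate state[NSUBPAGES];
        unsigned char data[NSUBPAGES][SUBPAGE];
    };

    static struct local_cache cache[NCACHES];

    /* Read: copy the sub-page from whichever cache holds it; both
     * caches keep a copy afterwards.
     */
    void read_subpage(int a, int sp)
    {
        int b;
        if (cache[a].state[sp] != INVALID)
            return;                      /* already present locally  */
        for (b = 0; b < NCACHES; b++)    /* "search engine" stand-in */
            if (b != a && cache[b].state[sp] != INVALID) {
                memcpy(cache[a].data[sp], cache[b].data[sp], SUBPAGE);
                cache[a].state[sp] = COPY;
                cache[b].state[sp] = COPY;  /* B keeps its copy too */
                return;
            }
    }

    /* Write: remove every other copy, leaving one exclusive instance. */
    void write_subpage(int a, int sp)
    {
        int b;
        read_subpage(a, sp);             /* fetch if not present */
        for (b = 0; b < NCACHES; b++)
            if (b != a)
                cache[b].state[sp] = INVALID;
        cache[a].state[sp] = EXCLUSIVE;
    }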
 
In order to maintain consistency, each local cache records state information
about the sub-pages it has stored. These states are specific to the physical
instance of a sub-page within a particular local cache. Thus a single sub-page
in SVA space may be in Invalid state in one local cache and in Copy state in
another. Some sub-page states are used and maintained exclusively by hardware
as part of the operation of the search engine. Others can be manipulated
indirectly by the operation of software.
 
There are times when two or more processors need to synchronize their access to
SVA locations. The ALLCACHE memory supports this requirement through
instructions which lock and unlock sub-pages. These instructions can be used to
implement any multi-processor synchronization functions including data locks,
barriers, critical regions, and condition variables. (All of these forms of
synchronization and others are available via KSR compilers, libraries, and OS
calls.)
 
A "lock" in ALLCACHE memory is achieved by setting a sub-page to the Atomic
state. A program does that by issuing a GET instruction on the address of a
byte within the desired sub-page. This instruction will cause the search engine
to find the sub-page and -- if the page is not in Atomic state -- return it to
the requesting processor in Atomic state. In the process the search engine will
ensure that all other copies of the sub-page are set Invalid.  If the sub-page
is already Atomic it will not be returned to the requestor immediately. Instead
the request packet will return to the requestor with an indicator that the
sub-page was found in the Atomic state.  A program removes Atomic state from a
sub-page by issuing the RELEASE instruction. 
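
The GET/RELEASE pair maps naturally onto a simple lock.  A minimal sketch in
C, where ksr_get_subpage() and ksr_release_subpage() are hypothetical
wrappers standing in for the GET and RELEASE instructions (the posting gives
the instruction names but not a programming interface):

    /* Sketch of a lock built on the Atomic-state mechanism described
     * above.  The two wrappers are HYPOTHETICAL; only the GET and
     * RELEASE instruction names come from the posting.
     */
    extern int  ksr_get_subpage(void *addr);     /* 1 if Atomic state won */
    extern void ksr_release_subpage(void *addr); /* drop Atomic state     */

    void lock(void *subpage_byte)
    {
        while (!ksr_get_subpage(subpage_byte))
            ;               /* sub-page already Atomic: retry */
    }

    void unlock(void *subpage_byte)
    {
        ksr_release_subpage(subpage_byte);
    }
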
In addition to the basic functional roles of the search engine (finding
sub-pages within the set of local caches and maintaining consistency), the
search engine must be scalable -- it must be implemented in such a way that
good performance continues to be delivered as the number of processors grows.
This objective is achieved in the KSR1 by implementing the search engine as a
hierarchy.

The KSR1 search engine is a two-level hierarchy of uni-directional rings.
Each ring is a sequence of point-to-point connections among a set of units,
with the last unit in the set being connected back to the first. Each unit is a
combination of a router for request/response packets and a directory. The
router can move a packet farther along the ring or send it up or down in the
hierarchy. All of the units on all rings can operate simultaneously, so the
search engine is a highly parallel mechanism.

The lowest level rings are called Search-Engine:0s (or SE:0).   Each SE:0 can
be configured to contain from eight to 32 processor/local cache pairs.  Each
processor/local cache pair is connected to exactly one SE:0 via a unit which
contains a directory for that local cache. There is one entry in the directory
for each page allocated in the local cache. The entry gives the SVA address of
the page and the state of each of its sub-pages.  When a packet passes such a
unit, it can determine whether the sub-page the packet is seeking can be found
in the desired state in the local cache. If so, the unit routes the packet
there; if not, it moves the packet on to the next unit on the ring.

The unit on a SE:0 which connects upward to the next higher level is called an
ALLCACHE Routing and Directory cell or ARD. It contains a directory covering
the entire SE:0 -- there is an entry in its directory for every page allocated
on every local cache on the ring. When a packet reaches an ARD it will be moved
to the next unit on the SE:0 if the directory in the ARD indicates that the
data sought is on the SE:0. If not, the packet is routed up to the next higher
level in the hierarchy.

The ring at the top level of a KSR1 is called Search-Engine:1 (or SE:1). SE:1
becomes involved in a search operation when a processor requests a sub-page
which is stored (for the moment) in a local cache on a different SE:0.  A SE:1
can be configured to connect two to 34 SE:0s.  Hence the maximum system size in
a KSR1 is (32*34) 1088 processors with 34 Gigabytes of ALLCACHE memory.
 
SE:1 is composed of ARDs, each containing a directory for the SE:0 to which it
is connected. This directory is essentially a duplicate of the one stored in
the ARD on the corresponding SE:0. When a packet reaches an ARD on SE:1, it
will be moved to the next ARD on the ring if the directory in the ARD indicates
that the data sought is not on the corresponding SE:0. Otherwise, the packet is
routed down to the ARD on SE:0.

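Stripped to its essentials, the routing rule at each ARD is a directory test;
in outline (my paraphrase of the two paragraphs above, with a stub lookup
se0_has() standing in for the real hardware directory):

    /* Routing decision at an ARD, paraphrasing the description above.
     * se0_has() is a hypothetical stand-in for the directory lookup.
     */
    enum route { NEXT_UNIT_ON_RING, UP_TO_SE1, DOWN_TO_SE0 };

    extern int se0_has(int ard, long subpage);  /* directory test (stub) */

    /* On an SE:0: keep circling if the data is on this ring, else go up. */
    enum route route_at_se0_ard(int ard, long subpage)
    {
        return se0_has(ard, subpage) ? NEXT_UNIT_ON_RING : UP_TO_SE1;
    }

    /* On SE:1: go down if this ARD's SE:0 has the data, else keep going. */
    enum route route_at_se1_ard(int ard, long subpage)
    {
        return se0_has(ard, subpage) ? DOWN_TO_SE0 : NEXT_UNIT_ON_RING;
    }
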
In the KSR1 the packet passing speed of an SE:0 is 8 million packets per
second. SE:1s can be configured to handle 8, 16, or 32 million packets per
second.  Each packet contains 128 bytes of data; hence the SE:0 bandwidth is 1
Gigabyte/sec and the SE:1 bandwidth ranges from 1 to 4 Gigabytes/sec.
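
Those maximums are internally consistent, which is easy to check (the
arithmetic is mine; the parameters all come from the text above):

    /* Checking the quoted maximums against the stated parameters. */
    #include <stdio.h>

    int main(void)
    {
        int  procs  = 32 * 34;          /* 32 procs/SE:0, up to 34 SE:0s  */
        long mem_mb = (long)procs * 32; /* 32MB local cache per processor */
        long se0_mb = 8L * 128;         /* 8M packets/s * 128 B ~= MB/s   */

        printf("max processors : %d\n", procs);               /* 1088  */
        printf("ALLCACHE memory: %ld MB (~34 GB)\n", mem_mb); /* 34816 */
        printf("SE:0 bandwidth : %ld MB/s (~1 GB/s)\n", se0_mb); /* 1024 */
        return 0;
    }
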
The KSR1 Processor
The KSR1 processor is a four chip set implemented in 1.2 micron CMOS.  
One of these chips, called the Cell Execution Unit or CEU, is the basic control
unit of the processor.  On each clock cycle it fetches two instructions from
memory.  Certain instructions (loads, stores, branches, address arithmetic)
will be executed directly by the CEU; others will be passed to a co-processor
for execution.  The CEU is responsible for all instructions dealing with
memory.  These instructions operate on 40 bit addresses.  This design
characteristic of the processor architecture is fundamental to the system
design.  In order to build a shared memory multi-processor with large numbers
of processors, a large address is essential; 32 bits is not sufficient to
address the amount of memory required.  The KSR1 architecture actually
envisions a 64 bit address (pointers are stored as 64 bit quantities) but, due
to implementation constraints, the first generation address size is 40 bits --
and that is clearly sufficient for 1088 processor systems being built at this
time.  The CEU has 32 address registers, each 40 bits wide.
 
The CEU operates with three co-processors:
 
FPU (floating point unit) - This chip executes arithmetic operations on IEEE
floating point format values.  It has 64 registers each 64 bits wide.  It
supports linked triad instructions in which two floating point operations are
initiated from a single instruction, giving a peak floating point rate of 40
MFLOPS.   Sustained floating point performance depends on the application, of
course.  Examples include: 6.6 MFLOPS (Livermore Loops harmonic mean),  15
MFLOPS (100 X 100 Linpack), 28 MFLOPS (FFT), and 32 MFLOPS (Matrix Multiply).
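
A "linked triad" is the multiply-add pattern at the heart of loops like
DAXPY: at the quoted 20 MIPS, each triad instruction initiating two floating
point operations yields the 40 MFLOPS peak.  A minimal example of the pattern
(generic C, nothing KSR-specific):

    /* The classic linked-triad loop (DAXPY): a multiply plus an add
     * per element -- the pattern the FPU can initiate from a single
     * instruction.  Two flops per instruction at 20 MIPS = 40 MFLOPS
     * peak.
     */
    void daxpy(long n, double a, const double *x, double *y)
    {
        long i;
        for (i = 0; i < n; i++)
            y[i] = a * x[i] + y[i];
    }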
 
IPU (integer and logical operations unit) - This chip performs arithmetic and
logical operations on 64 bit integers stored in 32 registers (each 64 bits
wide).  
 
XIU (I/O channel) - This chip provides a 30 MB/sec pathway to peripheral
devices.  Since there is an XIU on every cell, large systems can be configured
with very high aggregate bandwidth to disk drives and networks.
 
 
Summary
 
The KSR1 story is primarily software: ease of portability and programmability
made possible by our ALLCACHE memory architecture.  In order to deliver
sequentially consistent shared memory we developed our own CMOS microprocessor;
the architecture is embedded in the silicon, taking advantage of very low
latency networks and providing a system design that rarely incurs a latency
penalty.  The ALLCACHE memory system delivers programming simplicity and
performance without sacrificing the scalability benefit of distributed memory. 
  
 
email: ksr-info@ksr.com

70.2. "Opens 3/23 - 3M shares" by DECEAT::SHAH () Wed Feb 26 1992 12:59
    
    The issue opens on 3/23 in the $9-$11 range. The initial offering is ~3M
    shares (at least, maybe more) under the symbol KSRC.
    
    The contacts at Kidder Peabody & Co. are -
    
    Bill Dore and Jeffery Hargaton (617) 261-1110.
    
    They are going to mail me the prospectus in a couple of days. I'll post
    more information if anyone else is interested.
    
    /Alkesh
70.3. "Any more information" by KEPNUT::MONTAGNA () Thu Mar 05 1992 13:00
    Any new information pro or con on Kendall Square Research?
    
    I heard that there is a great amount of interest in the stock this early
    and that the stock could be over subscribed before the initial
    offering.
70.4. "From tomorrow's VNS" by FRITOS::TALCOTT () Mon Mar 09 1992 18:59
 Kendall Square Research - Plans to go public within a month or so
        {The Wall Street Journal, 9-Mar-92, p. C6}
   The $9 to $11 a share price for the 10 million shares would value the
 company at more than $100 million, or 110 times last year's revenue. KSR has
 sold three supercomputers. KSR has spent $56.9 million on research and
 development since its 1986 founding and just sold its first supercomputer in
 September, reaping revenue of $903,668 for 1991. Losses for the year were
 $22.5 million, 69% wider than a loss of $13.3 million in 1990. The company has
 since shipped three more supercomputers, the largest of which lists for $2.2
 million, but only two have been accepted - and Kendall can recognize revenue
 for a shipment only after it's been accepted. The company's machines will cost
 from $50,000 to $30 million, but its biggest and most impressive - a $30
 million system - won't even be formally introduced until later this year.
 Kendall, which is entering a highly competitive field, isn't expected to
 report profits any time soon. The reception for KSR is likely to be a
 bellwether for other technology companies hoping to go public.
70.5. "KSRC to trade this AM" by CSSE::POTTER () Fri Mar 27 1992 12:52
     Kendall Square Research (NASDAQ:KSRC) is supposed to start trading
     this morning.  The IPO price is expected to be on the high end of the
     $9-$11 range listed in the prospectus.
    
     Anybody had any luck getting any shares in the IPO? Looks like I got
     shut out...the underwriter tells me demand is high.
    
     fwiw,
     John
70.6. "It was $11. That's where it opened and closed" by VINO::FLEMMING (Have XDELTA, will travel) Sat Mar 28 1992 07:31