Tuesday, July 26, 2016

Research about Tree Data Structure - Trie

Research Findings:
1.  Unlike a binary search tree, no node in the tree stores the key associated with that node; instead, its position in the tree defines the key with which it is associated.
2.  All the descendants of a node have a common prefix of the string associated with that node, and the root is associated with the empty string.
3.   A trie can be seen as a tree-shaped deterministic finite automaton. Each finite language is generated by a trie automaton, and each trie can be compressed into a deterministic acyclic finite state automaton.
4.  While a balanced binary search tree has O(log2 n) depth, the maximum depth of a trie equals the length of the longest word. (Again, wider but shorter.)

Problems:
Word add-and-search problem.

Advantages:
1.  It can replace a hash table for string keys, and it also supports prefix lookups.
2.  Burstsort, a fast trie-based string sorting algorithm.

About Data Structure:
1.  A trie is a special tree structure, quite different from a binary tree.
2.  The root holds no character. Each TrieNode has a children hashmap, a character, and an isEnd property.
3.  Each node can have many children.

Problems:
211. Add and Search Word - Data structure design

Names:
Trie
Digital Tree
Radix Tree
Prefix Tree

Type:
Bitwise Trie

Terms:
Dynamic Set
Associative Array

Feelings:
1. The root node holds no character. When traversing the tree, use the children map to test whether a child is the target node.
2. Also use the isEnd property to test whether a node marks the end of a word.
3. We can also use the backtracking technique to do pattern search (e.g., a '.' wildcard that matches any character).
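A minimal Python sketch of the TrieNode design described above, with the backtracking search for the '.' wildcard from problem 211 (class and method names are illustrative, not from any particular library):

```python
class TrieNode:
    def __init__(self):
        self.children = {}   # char -> TrieNode (the hashmap property)
        self.is_end = False  # marks the end of a complete word


class WordDictionary:
    def __init__(self):
        self.root = TrieNode()  # root holds no character

    def add_word(self, word):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_end = True

    def search(self, word):
        # Backtracking: '.' may match any child, so try all of them.
        def dfs(node, i):
            if i == len(word):
                return node.is_end
            ch = word[i]
            if ch == '.':
                return any(dfs(child, i + 1) for child in node.children.values())
            child = node.children.get(ch)
            return child is not None and dfs(child, i + 1)

        return dfs(self.root, 0)
```

For example, after add_word("bad"), both search("bad") and search("b.d") return True, while search("b") returns False because the isEnd flag is not set on the intermediate node.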

References:
https://en.wikipedia.org/wiki/Trie
http://www.geeksforgeeks.org/longest-prefix-matching-a-trie-based-solution-in-java/

Friday, May 27, 2016

Research about Pseudocode

Research Findings:
1.  Pseudocode is an informal high-level description of the operating principle of a computer program or other algorithm.
2.   There are many opportunities for developers to make mistakes, create unintentional complexity, or simply lose their way. Pseudocode is an incredibly useful tool in the developer's toolbox, helping her avoid many of the pitfalls that plague such a complex undertaking.
3.   Developers can look at the whole picture rather than the specific elements that make up that picture. They can be sure they have adequately defined the problem before they get too deep into the actual prototyping phase and realize they have forgotten something important. The prototyping phase can then move along more quickly, simply because there is no need to keep correcting code that has already been written.
4.   The output of the design phase is pseudocode.
5.  Pseudocode is logic without syntax. An architect needs to write readable pseudocode.
6.  A workable solution is one that step by step logically considers and eliminates options.
7.  Pseudocode helps the developer stop thinking about syntax and start asking: is my logic logical? Am I making sense solving this problem? Is this solution step by step?
8.  Pseudocode bridges the gap between logic and syntax of building perfect statements.
9.  Pseudocode does not have to be perfect computer code, but it should make perfect logic.
10.  Pseudocode uses the structural conventions of a programming language, but is intended for human reading rather than machine reading.

Rule:
1.  Concentrate on logic first, build perfect statements later. [Golden Rule.]

Benefits:
1.  Focus on the big picture, instead of detailed level pointer manipulation.
2.  There is no standardised style or format, so programmers are free to adapt pseudocode to their own needs.
3.  Pseudocode helps improve concentration.
4.  Programmer does not have to think about syntax. Programmer can concentrate on logic.
5.  Pseudocode can be reviewed by groups more easily than real code.
6.  Pseudocode allows programmers who work in different computer languages to talk to each other.

Disadvantage:
1.  Unlike a flowchart or diagram, it does not give a visual picture of the design.

Goals:
1.  Very brief,
2.  Let other people understand the high level design of the algorithm

Code:
1.  Clean
2.  Executable, and gives correct results.

Recommendations:
1.  For medium and hard problems, write pseudocode to aid communication and reduce defects.
2.  For very simple problems, pseudocode is unnecessary.
3.  I have written pseudocode to help solve some medium problems, such as parsing expressions.
4.  Use pseudocode only as the high-level design, and then write code to implement it.
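As a sketch of the kind of pseudocode recommended above, here is what an expression-parsing problem (evaluating + and * with precedence, via the classic two-stack method) might look like before any real code is written — logic only, no syntax:

```
evaluate(expression):
    for each token in expression:
        if token is a number:
            push it onto the operand stack
        else if token is an operator:
            while the top of the operator stack has higher or equal precedence:
                pop an operator, pop two operands, apply it, push the result
            push token onto the operator stack
    while the operator stack is not empty:
        pop an operator, pop two operands, apply it, push the result
    return the top of the operand stack
```

Notice there are no types, no tokenizer details, no error handling — only the logic a reviewer needs to check.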

References:
https://en.wikipedia.org/wiki/Pseudocode
http://www.ehow.com/info_10060077_advantages-using-pseudocode.html
http://www.cs.iit.edu/~cs561/cs105/pseudocode1/sld001.htm
http://formaldesign-techniques.blogspot.com/2009/03/advantages-and-disadvantages-of_20.html
https://docs.google.com/document/d/14x2S42cQ4J20uKRc_2LxrM-u39LE6j3M5FVCdQx5Jg8/edit
https://www.reddit.com/r/programming/comments/1250eg/how_to_crack_the_toughest_coding_interviews_by/





Wednesday, May 25, 2016

Research about the Android and iPhone markets

Reality:
1. Android is dominating the market and the downloads, but it is losing on app revenue.
2. Android 4+ has 40% of the market.

Differences:
1. Releasing an app is easier on Android than on Apple's App Store.

Feelings:
1. You will never know which will ultimately win: iPhone or Android?

References:
http://www.businessinsider.com/iphone-v-android-market-share-2014-5
http://bgr.com/2014/07/01/android-market-share-2014/
http://www.forbes.com/sites/tonybradley/2013/11/15/android-dominates-market-share-but-apple-makes-all-the-money/
http://techland.time.com/2013/04/16/ios-vs-android/
http://www.inc.com/erik-sherman/should-mobile-tech-startups-focus-on-android-over-ios.html
http://www.inquisitr.com/467457/google-android-4-0-now-controls-40-marketshare/

References in China:
http://chanye.18183.com/201407/140166.html
http://www.cnbeta.com/articles/303821.htm
http://chanye.ptbus.com/151150/
http://www.ali213.net/news/html/2014-7/110648.html
http://mytwoandahalfcents.com/android-app-vs-iphone-app-which-do-you-choose-for-your-startup/











Research about Android File Transfer

Research Findings:
1.  USB drivers for Android phones and tablets are needed to connect devices to a computer. This includes using the device for development (which requires Android SDK USB drivers, ADB, and Fastboot drivers) and transferring media content and files to phone storage.
2.  Android doesn’t include a file manager app by default.
3.   With the Android device in its default MTP mode (PTP is also available, and USB mass storage may be available on older devices), it will appear in your Windows or Linux file manager window as a standard device.
4.  Older Android devices support USB mass storage for transferring files back and forth with a computer. Modern Android devices use the MTP or PTP protocols — you can choose which one you prefer.
5.  USB mass storage is the standard protocol used by flash drives, external hard drives, SD cards, and other USB storage devices. The drive makes itself completely available to the computer, just as if it were an internal drive.
6.  Apple’s Mac OS X is a holdout — it doesn’t include MTP support at all. Apple’s iPod, iPhone, and iPad use their own proprietary syncing protocol along with iTunes, so why would they want to support a competing protocol?
7.  Any software program that supports grabbing photos from a digital camera will support grabbing photos from an Android phone when you select the PTP mode. PTP was designed to be a standard protocol for communicating with digital cameras.
8.  Your Android device will work with digital camera applications that support PTP but not MTP. Apple’s Mac OS X does support PTP, so you can use PTP mode to transfer photos from an Android device to a Mac over a USB connection without any special software.
9.  The 4GB file-size limit is a limitation of the FAT32 filesystem.
10.  Whether you’re formatting an internal drive, external drive, USB flash drive, or SD card, Windows will give you the choice of NTFS, FAT32, and exFAT.
11.  Android is a Linux-based operating system
12.  Google’s Android developers could modify the Linux kernel to fit their needs. Linux gives the Android developers a pre-built, already maintained operating system kernel to start with so they don’t have to write their own kernel.
14.  The hardware abstraction layer (HAL) defines a standard interface for hardware vendors to implement and allows Android to be agnostic about lower-level driver implementations. The HAL allows you to implement functionality without affecting or modifying the higher-level system. HAL implementations are packaged into module (.so) files and loaded by the Android system at the appropriate time.
15.  Android is a mobile operating system (OS) currently developed by Google, based on the Linux kernel and designed primarily for touchscreen mobile devices such as smartphones and tablets.
16.  Android has the largest installed base of all operating systems of any kind. Android has been the best selling OS on tablets since 2013, and on smartphones it is dominant by any metric
17.  Android was unveiled in 2007, along with the founding of the Open Handset Alliance – a consortium of hardware, software, and telecommunication companies devoted to advancing open standards for mobile devices
18.  As of July 2013, the Google Play store has had over one million Android applications ("apps") published, and over 50 billion applications downloaded

Experience:
1. The storage is very limited, constrained by the device's storage size.

Tools:
File Explorer
Internal SD Card
External SD Card
Android Camera

Storage:
Internal Storage Options

Android Phones:
Samsung Galaxy S6
Android Version 5.1.1 Lollipop

System Architecture:
https://source.android.com/devices/

File Storage:
DCIM
Downloads

Image Transfer:
Preview App
Image Transfer App
FTU
Ext4

Good Tools:
Airdrop (Not Gallery)

Common Issues:
Your device is rooted.
Device storage is too small.
It is hard to transfer media and movies to a computer.
Pictures don't show in the Image Transfer app.
Which part is owned by the hardware manufacturer, the Android framework, and the application developer?
Gestures often go unrecognized.

References:
https://source.android.com/devices/
http://android.stackexchange.com/questions/48812/bypass-4gb-limitation-of-android-file-transfer
https://en.wikipedia.org/wiki/Android_(operating_system)



Friday, May 13, 2016

Research about API Gateway

Research Findings:
1.  The API Gateway service provides API aggregation and transformation, simplifying backend system integration and optimizing mobile content. When you have to connect your mobile app to backend systems, the API Gateway provides a flexible and scalable solution.
2.  Services are integrated with Cloud Foundry by implementing a documented API for which the cloud controller is the client; we call this the Service Broker API.
3.  The API Gateway is one of the products available in Pivotal CF Mobile Services. It allows developers to easily create edge gateways that proxy calls between devices and an organization's internal services.
4.  Pivotal’s API Gateway uses JavaScript as its core language, but not your usual JavaScript engine. It’s powered by Nashorn, the latest JDK script engine. This approach allows developers to enjoy the benefits of a simple scripting language such as JavaScript while still being able to seamlessly integrate with Spring beans. You can invoke any Java Spring bean from your JavaScript code.
5.  Service brokers are an abstraction provided by Cloud Foundry that encourages loose coupling between an application and the service instances it consumes.
6.  Service brokers negotiate the details of the relationship between an application and its services, including the service's actual location, the appropriate credentials to use when accessing it, and limits on how much of the service's resources an application can consume.
7.  Every service broker follows a similar lifetime, involving separate actions by several different roles. Each of these actions, taken together, create the final service broker instance that application developers can consume
8.  Service brokers advertise a catalog of service offerings and service plans, as well as interpreting calls for provision (create), bind, unbind, and deprovision (delete).
9.  As companies think about cloud strategies, they need to think about API strategies too. Secondly, this move by Amazon clearly puts pressure on cloud service providers to offer similar functionality – which Pivotal has already done.
10.  When you have a great idea for an API, just like a great blog series, you often need a way to publish, promote, and administer it. This is the fundamental idea behind API management. API gateways on the other hand simply front an existing system with open API formats (like REST and JSON), possibly authorization and authentication semantics, but definitely to make it easier for software to be built on top of an existing system.
11.  API gateways are considerably different than API management. Gateways simply expose and handle requests to services; API management provides value-add around that core functionality, like caching, rate-limiting, billing, reporting, key/credential administration, and real-time diagnostics.
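To make the gateway-vs-management distinction in findings 10 and 11 concrete, here is a hypothetical Python sketch (the class names, routes, and rate limit are invented for illustration, not from any Pivotal product): the gateway part only routes requests to backends, while the rate limiter is the kind of value-add an API management layer wraps around it.

```python
class ApiGateway:
    """Gateway: fronts internal services behind one open interface (routing only)."""

    def __init__(self):
        self.routes = {}  # path prefix -> backend callable

    def register(self, prefix, backend):
        self.routes[prefix] = backend

    def handle(self, path):
        for prefix, backend in self.routes.items():
            if path.startswith(prefix):
                return backend(path)
        return {"status": 404}


class RateLimiter:
    """Management value-add: cap the number of calls per client key."""

    def __init__(self, gateway, limit):
        self.gateway, self.limit, self.counts = gateway, limit, {}

    def handle(self, client, path):
        self.counts[client] = self.counts.get(client, 0) + 1
        if self.counts[client] > self.limit:
            return {"status": 429}  # too many requests
        return self.gateway.handle(path)
```

Caching, billing, reporting, and credential administration would be further wrappers of the same shape around the core gateway.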

Keywords:
API Gateway
Service Broker

Terms:
Org, Space
Service Marketplace

Tools:
Ribbon
Eureka
Circuit Breaker
Zuul
Nashorn




References:
https://docs.pivotal.io/mobile/apigateway/
https://docs.pivotal.io/mobile/apigateway/dashboard-user-guide.html
http://docs.pivotal.io/pivotalcf/services/overview.html
http://docs.cloudfoundry.org/services/overview.html
https://docs.google.com/document/d/15G8ew0qEDqpuBTWH9YGHKhda6HaLvfKuS4pnB-CPm50/mobilebasic
http://www.3scale.net/2015/07/3scale-api-gateway-pivotal-web-services/
http://wwpi.com/aws-launches-api-gateway-to-build-run-scalable-application-backends/
http://www.developereconomics.com/api-management-tools-how-to-find-the-one-for-you/
http://www.slideshare.net/KaiWaehner/a-new-front-for-soa-open-api-and-api-management-as-game-changer
http://blog.smartbear.com/big-data/how-does-amazon-api-gateway-affect-api-management/
https://www.google.com/search?q=pivotal+cloud+foundry+api+gateway&safe=active&espv=2&biw=960&bih=484&site=webhp&source=lnms&tbm=isch&sa=X&ved=0ahUKEwjgiMX08NfKAhVCjXIKHcDfBlIQ_AUICCgD#imgrc=VXVL9nf4v8DXuM%3A
https://www.google.com/search?q=pivotal+cloud+foundry+api+gateway&safe=active&espv=2&biw=960&bih=484&site=webhp&source=lnms&tbm=isch&sa=X&ved=0ahUKEwjgiMX08NfKAhVCjXIKHcDfBlIQ_AUICCgD#imgrc=G3IFfhBIzCzDaM%3A
https://www.google.com/search?q=pivotal+cloud+foundry+api+gateway&safe=active&espv=2&biw=960&bih=484&site=webhp&source=lnms&tbm=isch&sa=X&ved=0ahUKEwjgiMX08NfKAhVCjXIKHcDfBlIQ_AUICCgD#imgrc=O0lIoiUgmh_N1M%3A
https://blog.pivotal.io/pivotal-cloud-foundry/products/an-introduction-to-the-pivotal-cf-mobile-suite-api-gateway
http://apigee.com/about/products/apigee-edge-intelligent-api-management
https://blog.pivotal.io/pivotal-cloud-foundry/products/an-introduction-to-the-pivotal-cf-mobile-suite-api-gateway
https://spring.io/blog/2015/01/28/the-api-gateway-pattern-angular-js-and-spring-security-part-iv
https://docs.pivotal.io/mobile/app_analytics/development-guide.html

Research about Pivotal Cloud Foundry Architecture

(Architecture diagrams for deployment time and run time were not preserved.)

Centralized Logging:
Centralized logging capability; a uuid follows each request from end to end.

Tuesday, April 26, 2016

Research about Matrix in Maths

Research Findings:
1. Where do matrices come into play? A linear system can be seen in matrix-vector form as Ax = b, where A is the coefficient matrix, x the vector of unknowns, and b the right-hand side.
2. Different fields in computer science need different math tools. Computer graphics needs matrices; video games need matrices.
3. Matrices are so similar to arrays, in fact, that arrays are typically used to represent matrices in computer programs.
4. An adjacency matrix is used to model a graph.
5. n-vector, column vector, row vector
6. zero matrix, diagonal matrix, identity matrix
7. A, T, I, U, L, P, S

Impact:
1.  Geometry
2.  Linear Transformation
3.  Terms
4.  Notations, and Basic Operations

Applications:
1.  Represent linear transformations
2.  Games
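A small Python sketch tying the findings together (plain nested lists as matrices, no library assumed): multiplication by a matrix implements a linear transformation, and multiplying an adjacency matrix by itself counts length-2 paths in a graph.

```python
def matmul(A, B):
    """Multiply an (m x n) matrix by an (n x p) matrix; entry (i, j) is a dot product."""
    n, p = len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(len(A))]


# Linear transformation: rotate a 2D point 90 degrees counterclockwise.
R = [[0, -1],
     [1,  0]]
point = [[1], [0]]          # column vector (1, 0)
rotated = matmul(R, point)  # (1, 0) maps to (0, 1)

# Adjacency matrix of a 3-node graph with edges 0 -> 1 and 1 -> 2.
Adj = [[0, 1, 0],
       [0, 0, 1],
       [0, 0, 0]]
paths2 = matmul(Adj, Adj)   # entry (i, j) counts length-2 paths from i to j
```

Here paths2[0][2] is 1, reflecting the single two-step path 0 -> 1 -> 2.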

Matrix Multiplication:
https://www.khanacademy.org/math/precalculus/precalc-matrices/matrix_multiplication/v/multiplying-a-matrix-by-a-matrix
https://en.wikipedia.org/wiki/Matrix_multiplication
http://math.stackexchange.com/questions/160328/what-is-the-usefulness-of-matrices
https://en.wikipedia.org/wiki/Applied_mathematics





Tuesday, April 19, 2016

Research about Science

Knowledge Sharing:
1. The scientific method is a body of techniques for investigating phenomena, acquiring new knowledge, or correcting and integrating previous knowledge.
2. The scientific method has been defined as "a method or procedure that has characterized natural science since the 17th century, consisting in systematic observation, measurement, and experiment, and the formulation, testing, and modification of hypotheses."
3.  The overall process of the scientific method involves making conjectures (hypotheses), deriving predictions from them as logical consequences, and then carrying out experiments based on those predictions
4. Scientists then test hypotheses by conducting experiments.
5. Under modern interpretations, a scientific hypothesis must be falsifiable, implying that it is possible to identify a possible outcome of an experiment that conflicts with predictions deduced from the hypothesis; otherwise, the hypothesis cannot be meaningfully tested
6.  The overall process involves making conjectures (hypotheses), deriving predictions from them as logical consequences, and then carrying out experiments based on those predictions to determine whether the original conjecture was correct.
7.  There are difficulties in a formulaic statement of method, however. Though the scientific method is often presented as a fixed sequence of steps, they are better considered as general principles.
8.  When applying the scientific method to scientific research, determining a good question can be very difficult and affects the final outcome of the investigation
9.  A hypothesis is a conjecture, based on knowledge obtained while formulating the question, that may explain the observed behavior of a part of our universe.
10. An invention is a unique or novel device, method, composition or process
11. The invention process is a process within an overall engineering and product development process.
12. Some inventions can be patented. A patent legally protects the intellectual property rights of the inventor and legally recognizes that a claimed invention is actually an invention.
14. A final point: a scientific hypothesis must be falsifiable, meaning that one can identify a possible outcome of an experiment that conflicts with predictions deduced from the hypothesis; otherwise, it cannot be meaningfully tested.
15. Testing is an investigation of whether the real world behaves as predicted by the hypothesis.
16. Scientists (and other people) test hypotheses by conducting experiments. The purpose of an experiment is to determine whether observations of the real world agree with or conflict with the predictions derived from a hypothesis.
17. Once a hypothesis is strongly supported by evidence, a new question can be asked to provide further insight on the same topic. Evidence from other scientists and experience are frequently incorporated at any stage in the process.
18. Science is a systematic enterprise that builds and organizes knowledge in the form of testable explanations and predictions about the Universe.
19. Scientists perform research toward a more comprehensive understanding of nature, including physical, mathematical and social realms.
20. A computer scientist is a scientist who has acquired knowledge of computer science, the study of the theoretical foundations of information and computation and their application.
21. Computer scientists typically work on the theoretical side of computer systems, as opposed to the hardware side that computer engineers mainly focus on (although there is overlap).
22. Although computer scientists can also focus their work and research on specific areas (such as algorithm and data structure development and design, software engineering, information theory, database theory, computational complexity theory, numerical analysis, programming language theory, computer graphics, and computer vision), their foundation is the theoretical study of computing from which these other fields derive
23. A primary goal of computer scientists is the development (and validation) of models—often mathematical in nature—for estimating the properties of computer-based systems (processors, programs, computers interaction with people, computers interacting with other computers, etc.) with an overarching objective of discovering designs that admit for improved performance (faster, better, cheaper, etc.).
24. A strong aptitude for mathematics is important for a computer scientist.
25. Good communication skills are also important for a computer scientist since a key part of being a good scientist is conveying results for use by others (generally via well-crafted publications and presentations).
26. Additionally, since computer scientists often work in teams on real-world projects, they must be able to communicate effectively with computer personnel, such as programmers and managers, as well as with users or other staff who may have no technical computer background
27. Computer scientists are often hired by software publishing firms, scientific research and development organizations where they develop the theories that allow new technologies to be developed.
28. Computer scientists can also be seen as a type of mathematician, since much of the field depends on mathematics itself.

About Applied Science:
1. Disciplines which use science like engineering and medicine may also be considered to be applied sciences.
2. Applied science is a discipline of science that applies existing scientific knowledge to develop more practical applications, like technology or inventions.

About Technology:
1. Technology ("science of craft") is the collection of techniques, skills, methods and processes used in the production of goods or services or in the accomplishment of objectives, such as scientific investigation.
2. Technology can be the knowledge of techniques, processes, etc. or it can be embedded in machines, computers, devices and factories, which can be operated by individuals without detailed knowledge of the workings of such things.

References:
https://en.wikipedia.org/wiki/Applied_science
https://en.wikipedia.org/wiki/Computer_scientist




References:
https://en.wikipedia.org/wiki/Scientific_method

About Arts:
1. The arts represent an outlet of expression, that is usually influenced by culture and which in turn helps to change culture.
2. Major constituents of the arts include literature – including poetry, novels and short stories, and epics; performing arts – among them music, dance, and theatre; culinary arts such as baking, chocolatiering, and winemaking; media arts like photography and cinematography, and visual arts – including drawing, painting, ceramics, and sculpting.
3. In its most basic abstract definition, art is a documented expression of a sentient being through or on an accessible medium so that anyone can view, hear or experience it.
4. The arts have been defined as "imaginative, creative, and nonscientific branches of knowledge considered collectively, esp. as studied academically."

References:
https://en.wikipedia.org/wiki/The_arts

Wednesday, March 30, 2016

Research about Graph Database


Knowledge Sharing:
1.  Authorization and access control solutions store information about parties (e.g., administrators) and resources (e.g., employees), together with the rules governing access to those resources. The control system then applies these rules to determine who can access or manipulate a resource. Access control has traditionally been implemented either using directory services or by building a custom solution inside an application’s backend. Unfortunately, hierarchical directory structures built on a relational database suffer join pain as the dataset grows, becoming slow and unresponsive, and ultimately delivering a poor end-user experience.
2. PageRank works by counting the number and quality of links to a page to determine a rough estimate of how important the website is. The underlying assumption is that more important websites are likely to receive more links from other websites.
3. In graph theory and network analysis, indicators of centrality identify the most important vertices within a graph. Applications include identifying the most influential person(s) in a social network, key infrastructure nodes in the Internet or urban networks, and super-spreaders of disease.
4. Social network analysis (SNA) is the process of investigating social structures through the use of network and graph theories. It characterizes networked structures in terms of nodes (individual actors, people, or things within the network) and the ties or edges (relationships or interactions) that connect them.
5. Betweenness is a centrality measure of a vertex within a graph (there is also edge betweenness, which is not discussed here). Betweenness centrality quantifies the number of times a node acts as a bridge along the shortest path between two other nodes
6. The Resource Description Framework (RDF) is a family of World Wide Web Consortium (W3C) specifications originally designed as a metadata data model.
7.  The RDF data model is similar to classical conceptual modeling approaches such as entity–relationship or class diagrams, as it is based upon the idea of making statements about resources (in particular web resources) in the form of subject–predicate–object expressions.
8.  An entity–relationship model (ER model) is a data model for describing the data or information aspects of a business domain or its process requirements.
9.   In a graph-based data model (RDF graph), the structure is a set of triples, each consisting of a subject, a predicate, and an object; SPARQL is the typical query language for RDF.
10.  A transportation company has hundreds of locomotive assets, and each locomotive might have thousands of parts.
11.  Titan is a scalable graph database optimized for storing and querying graphs containing hundreds of billions of vertices and edges distributed across a multi-machine cluster. Titan is a transactional database that can support thousands of concurrent users executing complex graph traversals in real time.
12.  Scaling graph data processing for real time traversals and analytical queries is Titan’s foundational benefit.
14.  Titan is a graph database engine. Titan itself is focused on compact graph serialization, rich graph data modeling, and efficient query execution. In addition, Titan utilizes Hadoop for graph analytics and batch graph processing. Titan implements robust, modular interfaces for data persistence, data indexing, and client access. Titan’s modular architecture allows it to interoperate with a wide range of storage, index, and client technologies; it also eases the process of extending Titan to support new ones.
15.  Embed Titan inside the application executing Gremlin queries directly against the graph within the same JVM. Query execution, Titan’s caches, and transaction handling all happen in the same JVM as the application while data retrieval from the storage backend may be local or remote.
16.  Interact with a local or remote Titan instance by submitting Gremlin queries to the server. Titan natively supports the Gremlin Server component of the Tinkerpop stack.
17.  Gremlin is Titan’s query language used to retrieve data from and modify data in the graph. Gremlin is a path-oriented language which succinctly expresses complex graph traversals and mutation operations. Gremlin is a functional language whereby traversal operators are chained together to form path-like expressions. For example, "from Hercules, traverse to his father and then his father’s father and return the grandfather’s name.
18.  Gremlin is developed independently from Titan and supported by most graph databases. By building applications on top of Titan through the Gremlin query language users avoid vendor-lock in because their application can be migrated to other graph databases supporting Gremlin.
19.  Titan supports two different kinds of indexing to speed up query processing: graph indexes and vertex-centric indexes. Most graph queries start the traversal from a list of vertices or edges that are identified by their properties. Graph indexes make these global retrieval operations efficient on large graphs. Vertex-centric indexes speed up the actual traversal through the graph, in particular when traversing through vertices with many incident edges.
20.  Titan stores graphs in adjacency list format which means that a graph is stored as a collection of vertices with their adjacency list. The adjacency list of a vertex contains all of the vertex’s incident edges (and properties).
21.  Graph databases are based on graph theory. Graph databases employ nodes, properties, and edges.
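A tiny Python sketch of the adjacency-list storage from finding 20 and the path-style traversal Gremlin expresses in finding 17 ("from Hercules, traverse to his father and then his father's father"). The vertex names come from the Titan documentation's example; the Graph class and its API are invented for illustration and are not Gremlin itself.

```python
class Graph:
    """Adjacency-list storage: each vertex keeps its outgoing edges grouped by label."""

    def __init__(self):
        self.adj = {}  # vertex -> {edge_label: [neighbor vertices]}

    def add_edge(self, src, label, dst):
        self.adj.setdefault(src, {}).setdefault(label, []).append(dst)
        self.adj.setdefault(dst, {})

    def out(self, vertices, label):
        # One traversal step: follow every edge with the given label.
        return [dst for v in vertices
                for dst in self.adj.get(v, {}).get(label, [])]


g = Graph()
g.add_edge("hercules", "father", "jupiter")
g.add_edge("jupiter", "father", "saturn")

# Chained steps, in the spirit of g.V().has('name','hercules').out('father').out('father')
grandfather = g.out(g.out(["hercules"], "father"), "father")
```

The chaining of out() calls mirrors how Gremlin's functional traversal operators compose into path-like expressions.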


Learning:
1.  With a graph database, we can ask intelligent questions and get answers directly from a query, instead of relying on clever data modeling with a fixed schema and attributes.
2.  Algorithm First.

Tools:
Neo4j
Titan (Tinkerpop)

Theory:
1.  CAP theorem: Consistency, Availability, Partition tolerance

Models:
Property Graph Model
RDF Graph
DAG

Terms:
Gremlin
RDF triples
GEL

References:
http://www.slideshare.net/verheughe/graph-connect-sf-4-oct-2013
https://www.dama.upc.edu/seminars/2nd-graph-ta/7GraphTAFebruary2014BarcelonaRoarAudunMyrsetandSebastianVerheugheUsingagraphdatabaseforresourceauthorization.pdf
http://neo4j.com/blog/access-control-lists-the-graph-database-way/
http://msdn.microsoft.com/en-us/library/azure/hh974476.aspx
http://neo4j.com/developer/guide-data-modeling/
http://www.slideshare.net/maxdemarzi/introduction-to-graph-databases-12735789
http://www.predictiveanalyticstoday.com/top-graph-databases/
https://en.wikipedia.org/wiki/Graph_database
http://tinkerpop.incubator.apache.org/docs/3.0.1-incubating/#intro
https://en.wikipedia.org/wiki/Bigtable
http://www.slideshare.net/knowfrominfo/titan-big-graph-data-with-cassandra?next_slideshow=1


Sunday, February 14, 2016

Research about overview of Machine Learning and Data Science

Research Findings:
1.  Engineers who are great in both fields are basically unicorns and are at least 10x as valuable as someone who is great in just one of the fields. These are the engineers who don’t just work on algorithms or systems all day but instead launch personalization products in the market. These are the types of engineers who are behind the personalization teams at companies such as Amazon, Netflix, LinkedIn and many successful personalization startups
2.  Machine learning is about both breadth and depth. You are expected to know the basics of the most important algorithms (see my answer to What are the top 10 data mining or machine learning algorithms?). On the other hand, you are also expected to understand the low-level, complicated details of algorithms and their implementations. I think the approach I am describing addresses both these dimensions, and I have seen it work.
3.  there are lots of folks in the market that are great engineers and there are also lots of folks who are great at machine learning, but there is a severe shortage of great Machine Learning Engineers.
4.  Knowing R or Python really well might amount to building a model faster or allow you to integrate it into software better, but it says nothing about your ability to choose the right model, or build one that truly speaks to the challenge at hand.
5.  The art of being able to do machine learning well comes from seeing the core concepts inside the algorithms and how they overlap with the pain points trying to be addressed.  Great practitioners start to see interesting overlaps before ever touching a keyboard.
6.  Data science and machine learning are not synonyms, but much of the core algorithmic variety used by data scientists comes from machine learning.
7.  R is a programming language and software environment for statistical computing and graphics supported by the R Foundation for Statistical Computing.
8.  Just when you became an R ninja, Python came around the corner and became the de facto standard. Just when you finally mastered how to lay out a kick-ass data pipeline using Hadoop, Spark became the new thing.
9. many machine learning jobs look for people that have a PhD in some quantitative discipline. The reason for this is that the PhD is the only degree that trains its students in the discipline of advanced research. Importantly, obtaining a PhD means you have solved something truly original and defended that in front of leaders in the field.
10. Although many vendors would have you believe machine learning can be a "service" or something that is packaged, real machine learning takes real research.  If it could be a packaged solution, everyone would do it and there would be no competitive advantage in either academia or enterprise.
11.  Thinking like a researcher means working towards something truly original, regardless of how small that piece of originality might be.  Don't look for obvious solutions; look for that thing that models the system in an original way and that offers you, your client, or your employer something truly competitive and interesting.
12.  Revisit the core concepts regularly, practice solving problems relentlessly, and work towards trying to solve something original.
14.  Jump in and have fun.  Machine learning is an exciting field and will help solve many of today's most difficult challenges.
15.  Add more algorithms (decision trees, neural nets, etc.). Learn more domains and problems. Study techniques for handling unstructured data. There are wonderful courses in the thread.
16.  if you're looking for a slow but mature growth into machine learning, familiarity with the basic concepts of computer science (algorithms, data structures, and complexity) and mathematical maturity in discrete math, matrix math, and probability and statistics are all important for understanding the more complex parts of machine learning you can get into.
17.  Importantly, there are a lot of algorithms/paradigms in machine learning. While you should have some understanding of these, it is equally important to have the basic intuition of machine learning - concepts of bias-variance tradeoff, overfitting, regularization, duality, etc. These concepts are often used in most or all of machine learning, in some form or the other.
18.  machine learning is responsible for everything from Siri, Google Now, the recommendation engine on YouTube and Netflix, to even the driverless cars. Surely, machine learning is something that every computer scientist will encounter sooner or later, and it is important to learn this well.
19.  Machine learning was a byproduct of the early endeavours to develop artificial intelligence. The aim was to make a machine learn from data. But this approach often resulted in reinventing already existing statistical models. This, coupled with the rise of knowledge- and logic-based approaches to AI, put machine learning out of favour in the AI community.
20.  Machine learning soon became a sub-sect of statistics and data mining.
21.  Machine Learning has become a separate field of its own. Instead of striving to achieve artificial intelligence, the main aim of machine learning has become more towards tackling solvable problems. It borrows techniques from statistics and probability to focus on predictions derived from data.
22.  AI includes concepts like symbolic logic, evolutionary algorithms and Bayesian statistics and many other concepts that don’t fall under the purview of machine learning.
23.  Apart from machine learning, AI tries to achieve a broad range of goals like Reasoning, Knowledge representation, Automated planning and scheduling, Natural language processing, Computer vision, Robotics and General intelligence.
24.  Machine learning, on the other hand, focuses on solving tangible, domain-specific problems through data and self-learning algorithms.
25.  In your free time, read papers like Google Map-Reduce [34], Google File System [35], Google Big Table [36], The Unreasonable Effectiveness of Data [37],etc There are great free machine learning books online and you should read those also. [38][39][40].
26.  “People You May Know” ads achieved a click-through rate 30% higher than the rate obtained by other prompts to visit more pages on the site. They generated millions of new page views. Thanks to this one feature, LinkedIn’s growth trajectory shifted significantly upward.
27.  Goldman is a good example of a new key player in organizations: the “data scientist.” It’s a high-ranking professional with the training and curiosity to make discoveries in the world of big data.
28.  If your organization stores multiple petabytes of data, if the information most critical to your business resides in forms other than rows and columns of numbers, or if answering your biggest question would involve a “mashup” of several analytical efforts, you’ve got a big data opportunity.
29.  If capitalizing on big data depends on hiring scarce data scientists, then the challenge for managers is to learn how to identify that talent, attract it to an enterprise, and make it productive.
30.  what data scientists do is make discoveries while swimming in data.
31.  But we would say the dominant trait among data scientists is an intense curiosity—a desire to go beneath the surface of a problem, find the questions at its heart, and distill them into a very clear set of hypotheses that can be tested.
32.  Some of the best and brightest data scientists are PhDs in esoteric fields like ecology and systems biology.
33.  The sexy job in the next 10 years will be statisticians. People think I’m joking, but who would’ve guessed that computer engineers would’ve been the sexy job of the 1990s
34.   There simply aren’t a lot of people with their combination of scientific background and computational and analytical skills.
35.  Think of big data as an epic wave gathering now, starting to crest. If you want to catch it, you need people who can surf.
36.  a data scientist requires comprehensive mastery of a number of fields, such as software development, data munging, databases, statistics, machine learning and data visualization.
37.  A data scientist with a software engineering background might excel at a company like this, where it’s more important that a data scientist make meaningful data-like contributions to the production code and provide basic insights and analyses.
38.  Data Scientists in this setting likely focus more on producing great data-driven products than they do answering operational questions for the company. Companies that fall into this group could be consumer-facing companies with massive amounts of data or companies that are offering a data-based service.
39.  Google is currently using Machine Learning a lot - in my estimate, over a hundred places in their systems have been replaced by Deep Learning and other ML techniques in the past few years
40.  Machine learning uses statistics (mostly inferential statistics) to develop self-learning algorithms.
41.  Artificial Intelligence is the science of developing a system or software that mimics how a human would respond and behave in a given circumstance.
42.  Machine Learning is a technology within the sphere of Artificial Intelligence.
43.  The pioneering technology within Machine Learning is the neural network (NN), which mimics (to a very rudimentary level) the pattern recognition abilities of the human brain by processing thousands or even millions of data points. Pattern recognition is pivotal in terms of intelligence.
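The concepts in point 17 (overfitting, regularization) are easiest to see in code. Below is a minimal NumPy sketch of L2 regularization (ridge regression) on a toy dataset of my own choosing; the data, degree, and lambda value are illustrative assumptions, not from any source above:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: y = sin(x) plus noise, 20 points.
x = np.linspace(0, 3, 20)
y = np.sin(x) + 0.1 * rng.standard_normal(20)

# Degree-9 polynomial features -- flexible enough to overfit 20 points.
X = np.vander(x, 10)

# Ordinary least squares vs. the ridge (L2-regularized) solution:
#   w_ridge = argmin ||Xw - y||^2 + lam * ||w||^2
#           = (X^T X + lam * I)^{-1} X^T y
w_ols = np.linalg.lstsq(X, y, rcond=None)[0]
lam = 1.0
w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(10), X.T @ y)

# Regularization shrinks the weight vector, trading a little bias
# for lower variance -- the bias-variance tradeoff from the notes.
print(np.linalg.norm(w_ols), np.linalg.norm(w_ridge))
```

The ridge weights always come out smaller in norm than the least-squares weights; tuning lam against held-out data is what controls the tradeoff in practice.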

Companies:
A Data Scientist is a Data Analyst Who Lives in San Francisco
Please Wrangle Our Data
We Are Data. Data Is Us
Reasonably Sized Non-Data Companies Who Are Data-Driven

Academic:
Maths
Statistics
Physics

Topics:
Logistic Regression and Linear Kernel SVMs,
PCA vs. Matrix Factorization,
regularization, or gradient descent.
LibSVM, Weka, scikit-learn
Big Data

Steps:
1. Andrew Ng's Course
2. Recommender Systems, Mining Massive Datasets
3. Read books
4. Try to apply it in a startup with Python

Algorithms:
L2-regularized Logistic Regression
k-means
LDA (Latent Dirichlet Allocation)
SVMs
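To make one item in the algorithms list concrete, here is a minimal sketch of Lloyd's algorithm for k-means in NumPy. The toy blobs, seeds, and iteration count are my own illustrative choices:

```python
import numpy as np

def kmeans(points, k, iters=20, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid-update steps."""
    rng = np.random.default_rng(seed)
    # Initialize centers with k distinct random points from the data.
    centers = points[rng.choice(len(points), k, replace=False)]
    for _ in range(iters):
        # Assignment step: each point joins its nearest center's cluster.
        dists = np.linalg.norm(points[:, None] - centers[None, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each center moves to the mean of its cluster
        # (keeping the old center if a cluster becomes empty).
        centers = np.array([points[labels == j].mean(axis=0)
                            if np.any(labels == j) else centers[j]
                            for j in range(k)])
    return centers, labels

# Two well-separated toy blobs around (0, 0) and (5, 5).
rng = np.random.default_rng(1)
pts = np.vstack([rng.normal(0, 0.3, (50, 2)), rng.normal(5, 0.3, (50, 2))])
centers, labels = kmeans(pts, k=2)
```

For well-separated blobs like these, the two centers converge to the blob means; on harder data, k-means is sensitive to initialization and is usually run several times.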

Software:
Python Package
R
Matlab

Maths:
Statistics
linear algebra [ feature vectors, eigenvectors, singular value decompositions, and PCA]
probability
optimization
vectors, matrices
Multivariable Calculus
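The linear algebra item above (eigenvectors, SVD, PCA) can be sketched in a few lines of NumPy; the toy data and its dominant direction are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-D data: 200 points stretched along the direction (1, 1).
direction = np.array([1.0, 1.0]) / np.sqrt(2)
data = rng.standard_normal((200, 1)) * 3 * direction \
       + 0.2 * rng.standard_normal((200, 2))

# PCA: center the data, then take the right singular vectors of the
# centered matrix -- these are the eigenvectors of the covariance matrix.
centered = data - data.mean(axis=0)
U, s, Vt = np.linalg.svd(centered, full_matrices=False)
pc1 = Vt[0]  # first principal component: the direction of maximum variance

# Variance explained by each component is s_i^2 / (n - 1).
explained = s**2 / (len(data) - 1)
```

Here pc1 recovers the (1, 1) direction up to sign, and the first explained-variance value dominates the second, which is exactly the "feature vectors, eigenvectors, singular value decompositions" connection the note points at.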

Tools:
Coursera
MIT OpenCourseWare
Stanford Courses
Videos on YouTube

Top Academics:
Stanford
MIT
CMU
Harvard

Jobs:
Data Scientist
PhD

Responsibilities:
Data Analyst
Data Analytics
Data Engineer

Terms:
Artificial Intelligence
Machine Learning
Data Mining
Data Science

Resources:
Books
Articles
Research papers

AI and Machine Learning:
Python, R, Java
Probability and Statistics
Applied Maths + Algorithms
Distributed Computing, Data Science, Hadoop, MapReduce
Expert in Unix Tools
Hadoop Subprojects - HBase, Zookeeper, Hive, Mahout
Learn about advanced signal processing techniques

Getting Started:
http://www.r2d3.us/visual-intro-to-machine-learning-part-1/
http://scikit-learn.org/stable/tutorial/basic/tutorial.html
https://www.datacamp.com/courses/kaggle-tutorial-on-machine-learing-the-sinking-of-the-titanic

Course:
https://www.coursera.org/learn/machine-learning/home/week/1

Examples:
LinkedIn



References:
https://www.quora.com/How-do-I-learn-machine-learning-1
https://www.quora.com/What-are-the-top-10-data-mining-or-machine-learning-algorithms/answer/Xavier-Amatriain
http://www.datascienceontology.com/
https://en.wikipedia.org/wiki/R_(programming_language)?q=get%20wiki%20data
https://www.linkedin.com/pulse/20141113191054-103457178-the-only-skill-you-should-be-concerned-with
http://blog.hackerearth.com/2016/01/getting-started-with-machine-learning.html?utm_source=Quora&utm_medium=Content&utm_content=MachineLearning&utm_campaign=BlogQuora
https://www.quora.com/What-is-the-best-language-to-use-while-learning-machine-learning-for-the-first-time/answer/Travis-Addair
https://www.quora.com/What-skills-are-needed-for-machine-learning-jobs/answer/Joseph-Misiti?srid=pBDw&share=4fa777bd
https://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century/ar/1
http://blog.udacity.com/2014/11/data-science-job-skills.html
http://www.stat.cmu.edu/~larry/=sml/

Friday, February 12, 2016

Research about Complexity Theory in Time

Research Findings:
1. In computational complexity theory, a complexity class is a set of problems of related resource-based complexity.
2. A typical complexity class has a definition of the form: the set of problems that can be solved by an abstract machine M using O(f(n)) of resource R, where n is the size of the input.
3. Complexity classes are concerned with the rate of growth of the requirement in resources as the input n increases.
4. Cobham's thesis states that polynomial time is a synonym for "tractable", "feasible", "efficient", or "fast".
5.  Any problem that cannot be contained in P is not feasible, but if a real-world problem can be solved by an algorithm existing in P, generally such an algorithm will eventually be discovered.
6.  In similar spirit, NC complexity class can be thought to capture problems "effectively solvable" on a parallel computer.
7.  Typical input lengths that users and programmers are interested in are approximately between 100 and 1,000,000. Consider an input length of n = 100 and a polynomial algorithm whose running time is n². This is a typical running time for a polynomial algorithm.
8.  A typical CPU will be able to do approximately 10^9 operations per second (this is extremely simplified).
9.  an algorithm that runs in exponential time might have a running time of 2^n.
10.  Mathematically speaking, for big enough inputs, any polynomial time algorithm will beat any exponential time algorithm, and by arbitrarily large amounts.
11.  There are many lines of objection to Cobham's thesis. The thesis essentially states that "P" means "easy, fast, and practical," while "not in P" means "hard, slow, and impractical."
12.  Mathematical symbols can designate numbers (constants), variables, operations, functions, punctuation, grouping, and other aspects of logical syntax.
14.  Problems are said to be tractable if they can be solved in terms of a closed-form expression
15.  the class of expressions considered to be analytic expressions tends to be wider than that for closed-form expressions. In particular, special functions such as the Bessel functions and the gamma function are usually allowed, and often so are infinite series and continued fractions. On the other hand, limits in general, and integrals in particular, are typically excluded.
16.  If a is an algebraic number different from 0 and 1, and b is an irrational algebraic number, then all values of a^b are transcendental numbers (that is, not algebraic).
17.  In mathematics, the logarithm is the inverse operation to exponentiation.
18.  Trigonometric functions are important in the study of triangles and modeling periodic phenomena, among many other applications.
19.  Semantics is the study of meaning. Formal semantics is about attaching meaning to expressions.
20.  Algebra studies two main families of equations: polynomial equations and, among them, linear equations. Polynomial equations have the form P(x) = 0, where P is a polynomial. Linear equations have the form a(x) + b = 0, where a is a linear function and b is a vector. To solve them, one uses algorithmic or geometric techniques, coming from linear algebra or mathematical analysis.
21.  The P versus NP problem is a major unsolved problem in computer science. Informally speaking, it asks whether every problem whose solution can be quickly verified by a computer can also be quickly solved by a computer
22.  When employed in industrial contexts, machine learning methods may be referred to as predictive analytics or predictive modelling.
23.  In essence, a Turing machine is imagined to be a simple computer that reads and writes symbols one at a time on an endless tape by strictly following a set of rules.
24.  A tuple is a finite ordered list of elements. In mathematics, an n-tuple is a sequence (or ordered list) of n elements, where n is a non-negative integer.
25.  Turing machines, first described by Alan Turing in (Turing 1937), are simple abstract computational devices intended to help investigate the extent and limitations of what can be computed.
26.  Turing was interested in the question of what it means for a task to be computable, which is one of the foundational questions in the philosophy of computer science. Intuitively a task is computable if it is possible to specify a sequence of instructions which will result in the completion of the task when they are carried out by some machine. Such a set of instructions is called an effective procedure, or algorithm, for the task. The problem with this intuition is that what counts as an effective procedure may depend on the capabilities of the machine used to carry out the instructions. In principle, devices with different capabilities may be able to complete different instruction sets, and therefore may result in different classes of computable tasks
27.  Turing proposed a class of devices that came to be known as Turing machines. These devices lead to a formal notion of computation that we will call Turing-computability. A task is Turing computable if it can be carried out by some Turing machine.
28.  Turing machines are not physical objects but mathematical ones
29.  A Turing machine is a kind of state machine. At any time the machine is in any one of a finite number of states. Instructions for a Turing machine consist in specified conditions under which the machine will transition between one state and another.
30.  Even when a problem is decidable and thus computationally solvable in principle, it may not be solvable in practice if the solution requires an inordinate amount of time or memory.
31.  In worst-case analysis, the form we consider here, we consider the longest running time of all inputs of a particular length. In average-case analysis, we consider the average of all the running times of inputs of a particular length.
32.  In one convenient form of estimation, called asymptotic analysis, we seek to understand the running time of the algorithm when it is run on large inputs. We do so by considering only the highest order term of the expression for the running time of the algorithm, disregarding both the coefficient of that term and any lower order terms, because the highest order term dominates the other terms on large inputs.
33.  the expression O(1) represents a value that is never more than a fixed constant.
34.  any language that can be decided in o(n log n) time on a single-tape Turing machine is regular.
35.  we exhibited a two-tape TM M3 that decides A in O(n) time. Hence the time complexity of A on a single-tape TM is O(n log n) and on a two-tape TM it is O(n). Note that the complexity of A depends on the model of computation selected.
36.  This discussion highlights an important difference between complexity theory and computability theory. In computability theory, the Church-Turing thesis implies that all reasonable models of computation are equivalent-that is, they all decide the same class of languages. In complexity theory, the choice of model affects the time complexity of languages. Languages that are decidable in, say, linear time on one model aren't necessarily decidable in linear time on another.
37.  In complexity theory, we classify computational problems according to their time complexity.
38.  For our purposes, polynomial differences in running time are considered to be small, whereas exponential differences are considered to be large.
39.  Polynomial time algorithms are fast enough for many purposes, but exponential time algorithms rarely are useful.
40.  Exponential time algorithms typically arise when we solve problems by exhaustively searching through a space of solutions, called brute-force search. Sometimes, brute-force search may be avoided through a deeper understanding of a problem, which may reveal a polynomial time algorithm of greater utility.
41.  All reasonable deterministic computational models are polynomially equivalent.
42.  our aim is to present the fundamental properties of computation, rather than properties of Turing machines or any other special model.
43.  P is invariant for all models of computation that are polynomially equivalent to the deterministic single-tape Turing machine, and
44.  P roughly corresponds to the class of problems that are realistically solvable on a computer.
45.  When we present a polynomial time algorithm, we give a high-level description of it without reference to features of a particular computational model. Doing so avoids tedious details of tapes and head motions. We need to follow certain conventions when describing an algorithm so that we can analyze it for polynomiality
46.  We describe algorithms with numbered stages. The notion of a stage of an algorithm is analogous to a step of a Turing machine, though of course, implementing one stage of an algorithm on a Turing machine, in general, will require many Turing machine steps.
47.  When we analyze an algorithm to show that it runs in polynomial time, we need to do two things. First, we have to give a polynomial upper bound (usually in big-O notation) on the number of stages that the algorithm uses when it runs on an input of length n. Then, we have to examine the individual stages in the description of the algorithm to be sure that each can be implemented in polynomial time on a reasonable deterministic model.
48.  When both tasks have been completed, we can conclude that the algorithm runs in polynomial time because we have demonstrated that it runs for a polynomial number of stages, each of which can be done in polynomial time, and the composition of polynomials is a polynomial.
49.  We continue to use the angle-bracket notation ⟨·⟩ to indicate a reasonable encoding of one or more objects into a string, without specifying any particular encoding method.
50.  we can avoid brute-force search in many problems and obtain polynomial time solutions.
51.  However, attempts to avoid brute force in certain other problems, including many interesting and useful ones, haven't been successful, and polynomial time algorithms that solve them aren't known to exist.
52.  One remarkable discovery concerning this question shows that the complexities of many problems are linked. A polynomial time algorithm for one such problem can be used to solve an entire class of problems.
53.  verifying the existence of a Hamiltonian path may be much easier than determining its existence.
54.  a polynomial time algorithm for testing whether a number is prime or composite was discovered, but it is considerably more complicated than the preceding method for verifying compositeness.
55.  NP is the class of languages that have polynomial time verifiers.
56.  The term NP comes from nondeterministic polynomial time and is derived from an alternative characterization by using nondeterministic polynomial time Turing machines.
57.  Problems in NP are sometimes called NP-problems.
58.  We show how to convert a polynomial time verifier to an equivalent polynomial time NTM and vice versa. The NTM simulates the verifier by guessing the certificate. The verifier simulates the NTM by using the accepting branch as the certificate.
59.  The class NP is insensitive to the choice of reasonable nondeterministic computational model because all such models are polynomially equivalent.
60.  When describing and analyzing nondeterministic polynomial time algorithms, we follow the preceding conventions for deterministic polynomial time algorithms. Each stage of a nondeterministic polynomial time algorithm must have an obvious implementation in nondeterministic polynomial time on a reasonable nondeterministic computational model. We analyze the algorithm to show that every branch uses at most polynomially many stages.
61.  Verifying that something is not present seems to be more difficult than verifying that it is present.
62.  We make a separate complexity class, called coNP, which contains the languages that are complements of languages in NP. We don't know whether coNP is different from NP.
63.  As we have been saying, NP is the class of languages that are solvable in polynomial time on a nondeterministic Turing machine, or, equivalently, it is the class of languages whereby membership in the language can be verified in polynomial time. P is the class of languages where membership can be tested in polynomial time.
64.  where we loosely refer to polynomial time solvable as solvable "quickly."
65.  P = the class of languages for which membership can be decided quickly.
66.  NP = the class of languages for which membership can be verified quickly.
67.  The power of polynomial verifiability seems to be much greater than that of polynomial decidability. But, hard as it may be to imagine, P and NP could be equal. We are unable to prove the existence of a single language in NP that is not in P
68.  The question of whether P = NP is one of the greatest unsolved problems in theoretical computer science and contemporary mathematics.
70.  The best method known for solving languages in NP deterministically uses exponential time.
71.  They discovered certain problems in NP whose individual complexity is related to that of the entire class. If a polynomial time algorithm exists for any of these problems, all problems in NP would be polynomial time solvable. These problems are called NP-complete.
72.  On the theoretical side, a researcher trying to show that P is unequal to NP may focus on an NP complete problem. If any problem in NP requires more than polynomial time, an NP-complete one does. Furthermore, a researcher attempting to prove that P equals NP only needs to find a polynomial time algorithm for an NP-complete problem to achieve this goal.
73.  Even though we may not have the necessary mathematics to prove that the problem is unsolvable in polynomial time, we believe that P is unequal to NP, so proving that a problem is NP-complete is strong evidence of its nonpolynomiality.
74.  A Boolean formula is satisfiable if some assignment of 0s and 1s to the variables makes the formula evaluate to 1.
75.  If one language is polynomial time reducible to a language already known to have a polynomial time solution, we obtain a polynomial time solution to the original language,
76.  Showing that SAT is in NP is easy, and we do so shortly. The hard part of the proof is showing that any language in NP is polynomial time reducible to SAT.
77.  NP-completeness can be proved with a polynomial time reduction from a language that is already known to be NP-complete.
78.  If you seek a polynomial time algorithm for a new NP-problem, spending part of your effort attempting to prove it NP-complete is sensible because doing so may prevent you from working to find a polynomial time algorithm that doesn't exist.
79.  Our general strategy is to exhibit a polynomial time reduction from 3SAT to the language in question, though we sometimes reduce from other NP-complete languages when that is more convenient.
80.  To show that VERTEX-COVER is NP-complete we must show that it is in NP and that all NP-problems are polynomial time reducible to it.
81.  Time and space are two of the most important considerations when we seek practical solutions to many computational problems.
82.  NP-complete languages as representing the most difficult languages in NP.
83.  Complete problems are important because they are examples of the most difficult problems in a complexity class
84.  A complete problem is most difficult because any other problem in the class is easily reduced into it, so if we find an easy way to solve the complete problem, we can easily solve all other problems in the class.
85.  The reduction must be easy, relative to the complexity of typical problems in the class, for this reasoning to apply. If the reduction itself were difficult to compute, an easy solution to the complete problem wouldn't necessarily yield an easy solution to the problems reducing to it.
86.  Whenever we define complete problems for a complexity class, the reduction model must be more limited than the model used for defining the class itself.
87.  To show that TQBF is in PSPACE we give a straightforward algorithm that assigns values to the variables and recursively evaluates the truth of the formula for those values. From that information the algorithm can determine the truth of the original quantified formula.
88.  To show that every language A in PSPACE reduces to TQBF in polynomial time, we begin with a polynomial space-bounded Turing machine for A. Then we give a polynomial time reduction that maps a string to a quantified Boolean formula X that encodes a simulation of the machine on that input. The formula is true iff the machine accepts.
89.  For the purposes of this section, a game is loosely defined to be a competition in which opposing parties attempt to achieve some goal according to prespecified rules.
90.  Games appear in many forms, from board games such as chess to economic and war games that model corporate or societal conflict.
91.  To illustrate the correspondence between games and quantifiers, we turn to an artificial game called the formula game.
92.  We say that Player E has a winning strategy for this game. A player has a winning strategy for a game if that player wins when both sides play optimally.
93.  To prove that GG is PSPACE-hard, we give a polynomial time reduction from FORMULA-GAME to GG.
94.  The only space required by this algorithm is for storing the recursion stack. Each level of the recursion adds a single node to the stack, and at most m levels occur, where m is the number of nodes in G. Hence the algorithm runs in linear space.
95.  Sublinear space algorithms allow the computer to manipulate the data without storing all of it in main memory
96.  Logarithmic space is just large enough to solve a number of interesting computational problems, and it has attractive mathematical properties such as robustness even when machine model and input encoding method change. Pointers into the input may be represented in logarithmic space, so one way to think about the power of log space algorithms is to consider the power of a fixed number of input pointers.
97.  Most computer-aided proofs to date have been implementations of large proofs-by-exhaustion of a mathematical theorem. The idea is to use a computer program to perform lengthy computations, and to provide a proof that the result of these computations implies the given theorem.
98.  A Boolean formula is satisfiable if some assignment of 0s and 1s to the variables makes the formula evaluate to 1.
99.  The satisfiability problem is to test whether a Boolean formula is satisfiable
100.  A formula is a 3cnf-formula if all the clauses have three literals.
101.  In a satisfiable cnf-formula, each clause must contain at least one literal that is assigned 1.
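The back-of-envelope numbers in findings 7-10 can be checked directly. A small sketch, assuming (as in the notes) roughly 10^9 operations per second and an input length of n = 100:

```python
# Rough running-time estimates at ~10^9 operations per second,
# matching the n = 100 example in the notes above.
OPS_PER_SECOND = 1e9
n = 100

poly_seconds = n**2 / OPS_PER_SECOND      # an n^2 algorithm: 10^4 operations
expo_seconds = 2**n / OPS_PER_SECOND      # a 2^n algorithm: ~1.27e30 operations
expo_years = expo_seconds / (3600 * 24 * 365)

print(f"n^2 at n=100: {poly_seconds:.0e} seconds")  # about ten microseconds
print(f"2^n at n=100: {expo_years:.0e} years")      # tens of trillions of years
```

The polynomial algorithm finishes instantly, while the exponential one needs on the order of 10^13 years, far beyond the age of the universe, which is the point of finding 10: for big enough inputs, any polynomial time algorithm beats any exponential one.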

Theoretical CS:
Theory
Definition, Theorems, Lemma, Corollary
Formula
Methodology
Complexity
Topics
Research

Time Complexity: P, NP, NP-Complete, NP-Hardness:
1.  P is a complexity class that represents the set of all decision problems that can be solved in polynomial time. That is, given an instance of the problem, the answer yes or no can be decided in polynomial time.
2.  Decision problem: A problem with a yes or no answer.
3.  NP is a complexity class that represents the set of all decision problems for which the instances where the answer is "yes" have proofs that can be verified in polynomial time.
4.  NP-Complete is a complexity class which represents the set of all problems X in NP for which it is possible to reduce any other NP problem Y to X in polynomial time.
5.  Intuitively, NP-Hard are the problems that are at least as hard as the NP-complete problems. Note that NP-hard problems do not have to be in NP, and they do not have to be decision problems.
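The verifier view of NP (point 3 above) is easy to make concrete. The sketch below uses VERTEX-COVER, mentioned in the Sipser notes earlier: the certificate is a candidate cover, and checking it takes time linear in the number of edges, even though finding a minimum cover is NP-complete. The graph here is my own illustrative example:

```python
def verify_vertex_cover(edges, cover, k):
    """Polynomial-time verifier for VERTEX-COVER.

    Accepts iff `cover` is a set of at most k vertices that touches
    every edge. Runs in O(|cover| + |edges|) time, even though
    *finding* a minimum cover is NP-complete.
    """
    cover = set(cover)
    if len(cover) > k:
        return False
    return all(u in cover or v in cover for (u, v) in edges)

# Illustrative 4-cycle: {0, 2} covers every edge, {0, 1} does not.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(verify_vertex_cover(edges, {0, 2}, k=2))  # True
print(verify_vertex_cover(edges, {0, 1}, k=2))  # False: edge (2, 3) uncovered
```

This is exactly the asymmetry the notes describe: verifying a given cover is quick, while the only known general way to find one is search over exponentially many candidate subsets.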

Turing Machine:
1. A Turing machine has an infinite one-dimensional tape divided into cells. Traditionally we think of the tape as being horizontal with the cells arranged in a left-right orientation. The tape has one end, at the left say, and stretches infinitely far to the right. Each cell is able to contain one symbol, either ‘0’ or ‘1’.
2. The machine has a read-write head which is scanning a single cell on the tape. This read-write head can move left and right along the tape to scan successive cells.
3. The action of a Turing machine is determined completely by (1) the current state of the machine (2) the symbol in the cell currently being scanned by the head and (3) a table of transition rules, which serve as the “program” for the machine.
4. if the machine is in state State_current and the cell being scanned contains Symbol, then move into state State_next, taking Action
5. ⟨ State_current, Symbol, State_next, Action ⟩
6. As actions, a Turing machine may either write a symbol on the tape in the current cell (which we will denote with the symbol in question), or move the head one cell to the left or right, which we will denote by the symbols « and » respectively.
7. In modern terms, the tape serves as the memory of the machine, while the read-write head is the memory bus through which data is accessed (and updated) by the machine. 
8. The first concerns the definition of the machine itself, namely that the machine's tape is infinite in length. This corresponds to an assumption that the memory of the machine is infinite. The second concerns the definition of Turing-computable, namely that a function will be Turing-computable if there exists a set of instructions that will result in a Turing machine computing the function regardless of the amount of time it takes. One can think of this as assuming the availability of infinite time to complete the computation.
9. If the machine reaches a situation in which there is no unique transition rule to be carried out, i.e., there is either no applicable rule or more than one, then the machine halts.
10. In order to speak about a Turing machine that does something useful, we will have to provide an interpretation of the symbols recorded on the tape.
11.  A Turing machine is a model of computation that completely captures the notion of computability, while remaining simple to reason about, without all the specific details of your PC's architecture.
12.  The (generally accepted) "Church-Turing thesis" asserts that every device or model of computation is no more powerful than a Turing machine.
13.  So many theoretical problems (e.g. classes like P and NP, the notion of "polynomial-time algorithm", and so on) are formally stated in terms of a Turing machine, although of course they can be adapted to other models as well.
14.  People talk about Turing machines because they are a precise and fully specified way to say what a "computer" is, without having to describe every detail of the CPU's architecture, its constraints, and so on.
15.  To be very technical, there is one key difference between a PC and a Turing Machine: Turing Machines have infinite memory. This means a PC is not (and no tangible device ever could be) technically as powerful as a Turing Machine, although in practice the difference is trivial.
16.  A Turing machine is a theoretical machine that can be used to reason about the limits of computers. Simply put, it is an imaginary computer with infinite memory.
17.  We care about Turing machines because they help us discover what is impossible to accomplish with real computers (like your IBM PC). If it is impossible for a Turing machine to perform a particular computation (like deciding the Halting Problem), then it stands to reason that it is impossible for your IBM PC to perform that same computation.
18.  Turing machines are basic abstract symbol-manipulating devices which, despite their simplicity, can be adapted to simulate the logic of any computer algorithm. They were described in 1936 by Alan Turing. Turing machines are not intended as a practical computing technology, but a thought experiment about the limits of mechanical computation. Thus they were not actually constructed. Studying their abstract properties yields many insights into computer science and complexity theory.
19.  A Turing machine that is able to simulate any other Turing machine is called a Universal Turing machine (UTM, or simply a universal machine). A more mathematically-oriented definition with a similar "universal" nature was introduced by Alonzo Church, whose work on lambda calculus intertwined with Turing's in a formal theory of computation known as the Church-Turing thesis. The thesis states that Turing machines indeed capture the informal notion of effective method in logic and mathematics, and provide a precise definition of an algorithm or 'mechanical procedure'.
20.  Using this scheme, we can demonstrate a method for giving evidence that certain problems are computationally hard, even if we are unable to prove that they are.
21.  You have several options when you confront a problem that appears to be computationally hard. First, by understanding which aspect of the problem is at the root of the difficulty, you may be able to alter it so that the problem is more easily solvable. Second, you may be able to settle for less than a perfect solution to the problem. In certain cases finding solutions that only approximate the perfect one is relatively easy. Third, some problems are hard only in the worst case situation, but easy most of the time. Depending on the application, you may be satisfied with a procedure that occasionally is slow but usually runs quickly. Finally, you may consider alternative types of computation, such as randomized computation, that can speed up certain tasks.
22.  The theories of computability and complexity are closely related. In complexity theory, the objective is to classify problems as easy ones and hard ones, whereas in computability theory the classification of problems is by those that are solvable and those that are not. Computability theory introduces several of the concepts used in complexity theory.
23.  Directed graphs are a handy way of depicting binary relations.
24.  Strings of characters are fundamental building blocks in computer science.
25.  Theorems and proofs are the heart and soul of mathematics and definitions are its spirit.
26.  Researchers sometimes work for weeks or even years to find a single proof.
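The transition-rule scheme described above, ⟨ State_current, Symbol, State_next, Action ⟩, is easy to sketch in code. Below is a minimal, illustrative simulator in Python: the rule table maps (state, scanned symbol) to (next state, action), where an action is either a symbol to write or a head move 'L'/'R', and the machine halts when no rule applies. The rule-table format, function name, and the bit-inverting example machine are my own toy constructions, not from any standard source.

```python
def run_turing_machine(rules, tape, state="q0", blank="_", max_steps=1000):
    """Run a Turing machine and return the final tape contents as a string.
    `rules` maps (state, symbol) -> (next_state, action), where action is
    'L', 'R', or a symbol to write into the current cell."""
    cells = dict(enumerate(tape))  # sparse tape: cell index -> symbol
    head = 0
    for _ in range(max_steps):  # guard against non-halting machines
        symbol = cells.get(head, blank)
        if (state, symbol) not in rules:
            break  # no applicable rule: the machine halts
        state, action = rules[(state, symbol)]
        if action == "L":
            head -= 1
        elif action == "R":
            head += 1
        else:
            cells[head] = action  # write the symbol into the current cell
    span = range(min(cells), max(cells) + 1) if cells else []
    return "".join(cells.get(i, blank) for i in span)

# Toy machine: walk right, inverting each bit, until a blank cell is reached.
invert_rules = {
    ("q0", "0"): ("q1", "1"),   # read 0: write 1
    ("q0", "1"): ("q1", "0"),   # read 1: write 0
    ("q1", "0"): ("q0", "R"),   # after writing, move right and re-read
    ("q1", "1"): ("q0", "R"),
}

print(run_turing_machine(invert_rules, "101"))  # -> "010"
```

The `max_steps` guard is a practical concession: a real Turing machine may run forever, and (per the Halting Problem mentioned above) no general procedure can tell us in advance whether it will.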

Learning:
1.  Scientific research uses R&D to uncover hidden theories, which help humans understand a subject more deeply. These theories are very important, and are the foundation of further research.

Theory:
Savitch's Theorem
Cook-Levin theorem

Term:
Turing Machine
Non-deterministic Turing Machine [NTM]
Deterministic Turing Machine [DTM]
Asymptotic Analysis
Polynomial Bounds
Exponential Bounds
linear time
Polynomial difference
polynomiality
nonpolynomiality
Cook-Levin theorem
PSPACE-Hard
Sublinear Space Complexity

Expressions:
Arithmetic Expression
Polynomial Expression
Algebraic Expression
Closed-form Expression
Analytic Expression
Mathematical Expression

Books:
Introduction to the Theory of Computation

Questions:
What are the fundamental capabilities and limitations of computers?
What makes some problems computationally hard and others easy?

People:
Turing is widely considered to be the father of theoretical computer science and artificial intelligence.

References:
https://en.wikipedia.org/wiki/Cobham%27s_thesis
https://en.wikipedia.org/wiki/Polynomial#Generalizations_of_polynomials
https://en.wikipedia.org/wiki/Trigonometric_functions
https://en.wikipedia.org/wiki/Millennium_Prize_Problems#P_versus_NP
https://en.wikipedia.org/wiki/Non-deterministic_Turing_machine
http://plato.stanford.edu/entries/turing-machine/
https://en.wikipedia.org/wiki/State_diagram
http://stackoverflow.com/questions/236000/whats-a-turing-machine
https://en.wikipedia.org/wiki/NP-hardness
http://stackoverflow.com/questions/1857244/what-are-the-differences-between-np-np-complete-and-np-hard
https://en.wikipedia.org/wiki/Computer-assisted_proof


Sunday, February 7, 2016

Research about Registered Agent

Research Findings:
1.  Commercial law, also known as business law, is the body of law that applies to the rights, relations, and conduct of persons and businesses engaged in commerce, merchandising, trade, and sales.
2.  In United States business law, a registered agent, also known as a resident agent or statutory agent, is a business or individual designated to receive service of process (SOP) when a business entity is a party in a legal action such as a lawsuit or summons.
3.  The registered agent's address may also be where the state sends the paperwork for the periodic renewal of the business entity's charter (if required).
4.  The registered agent for a business entity may be an officer or employee of the company, or a third party, such as the organization's lawyer or a service company. Failure to properly maintain a registered agent can affect a company negatively.
5.  Most businesses are not individuals but instead business entities such as corporations or limited liability companies (LLCs). This is because there are substantive (and substantial) liability protections as well as tax advantages to being "incorporated" as opposed to being "self-employed".
6.  The purpose of a Registered Agent is to provide a legal address (not a P.O. Box) within that jurisdiction where there are persons available during normal business hours to facilitate legal service of process being served in the event of a legal action or lawsuit.
7.  Generally, the registered agent is also the person to whom the state government sends all official documents required each year for tax and legal purposes, such as franchise tax notices and annual report forms.
8.  Registered Agents generally will also notify business entities if their state government filing status is in "Good Standing" or not.
9.  The reason that these notifications are a desired function of a registered agent is that it is difficult for a business entity to keep track of legislative changes and report due dates for multiple jurisdictions given the disparate laws of different states.
10.  Penalties for not maintaining a registered agent generally will cause a jurisdiction to revoke a business’s corporate or LLC legal status as well as in some cases, assess additional penalty fees on the entity.
11.  This is one of the most common reasons that business entities generally will utilize a third party as their Registered Agent be it a commercial service company, an attorney, or in some cases, a CPA.
12.  The person at the business entity that maintains contact with the registered agent is the corporate secretary or governance officer.
13.  No matter where you’re starting your business, if you’re forming an LLC or corporation, you’re required to have a registered agent and a registered office.
14.  A registered agent is simply a person or entity appointed to accept service of process and official mail on your business’ behalf. You can appoint yourself, or in many states, you can appoint your business to be its own registered agent.
15.  A lawsuit against your business cannot move forward in court without your business being properly notified first.
16.  The registered agent must have a physical address within that state and be available during business hours so someone suing you can easily find you. This requirement gives the court an easy way to notify you. This also eliminates the possibility of big corporations hiding behind 1000’s of employees. The registered agent is your business’ point of contact with the state and for service of process. The theory behind the requirements of the listed registered agent is to ensure your business maintains a reliable way to be contacted.
17.  The majority of small businesses (10 employees or less) do not hire registered agents. That said, there are some specific reasons why some business owners do opt to hire registered agent services; you’ll find those reasons listed below:
18.  If you hire a registered agent service your registered agent should have a system in place to track and notify you when annual reports are due to keep your business in compliance with the state, so you don’t have to worry about it. Also, all your important documents will be kept in one place and you don’t have to bother keeping track of notices.
19.  In each state your business operates, you’ll need to register with that state, and in every state you register, you need a registered agent with a physical location in that state.
20.  A registered agent receives important legal and tax documents on behalf of a business, including important mail sent by the state (annual reports or statements), tax documents sent by the state’s department of taxation, and Service of Process—sometimes called Notice of Litigation, which initiates a lawsuit.
21.  Sometimes a registered agent is called a statutory agent, or an agent for service of process.
22.  Service of process is the procedure by which a party to a lawsuit gives an appropriate notice of initial legal action to another party (such as a defendant), court, or administrative body in an effort to exercise jurisdiction over that person so as to enable that person to respond to the proceeding before the court, body, or other tribunal.
23.  In many cases, you or another member of your business, such as a partner, a member of your LLC, or an officer of your corporation, will serve as the registered agent, and the address for the registered agent will be your business location.
24.  The top three registered agents by volume in the U.S. are CSC, CT, and Incorp.


Functionalities:
Track official notices and annual report due dates with the state.
Document Organization
Compliance Management
Reminds you of upcoming compliance requirements and deadlines, such as the due date for the annual report required by your state of incorporation.
Provides software to manage important corporate records and documents.
Offers secure, online access to important documents.
Monitors your company's status in the state(s) where it is registered.
Online account access to manage notifications

Questions:
1.  What is the best practice?

References:
https://en.wikipedia.org/wiki/Commercial_law
https://en.wikipedia.org/wiki/Registered_agent
http://www.northwestregisteredagent.com/registered-agent-marketshare.html
http://www.tccorporate.com.au/why-tc-corporate
http://www.bizfilings.com/learn/what-is-registered-agent.aspx
https://en.wikipedia.org/wiki/Service_of_process