Murphy’s law and Software Engineering

Murphy’s law is an adage or epigram that is typically stated as: “Anything that can go wrong will go wrong.” In some formulations, it is extended to “Anything that can go wrong will go wrong, and at the worst possible time.”

It’s interesting to explore how it applies to and impacts the typical software engineering process. In software development, Murphy’s law highlights a key point: computers do what you tell them to do, not what you want them to do.

The following is a conversation with ChatGPT:

Q: Can you explain the impact of Murphy’s Law on software engineering?

Certainly! Murphy’s Law, famously stated as “Anything that can go wrong, will go wrong,” has a significant impact on software engineering. It serves as a cautionary principle, reminding software engineers and developers to anticipate and prepare for potential failures during the development process. Here’s how Murphy’s Law influences software engineering:

  1. Emphasis on Thorough Testing: It encourages developers to rigorously test software for not only expected use cases but also for unexpected and edge cases. This thorough testing helps in identifying and fixing bugs early in the development cycle, reducing the risk of major failures post-deployment.
  2. Robust Design: Software engineers are prompted to design systems that are robust and can handle failures gracefully. This might include implementing fail-safes, redundancy, and error handling mechanisms to ensure that the system remains functional even under adverse conditions.
  3. Preventive Programming: The principle pushes developers to write code that is not just functional but also safe against misuse or unforeseen situations. This includes input validation, proper exception handling, and defensive coding practices.
  4. Disaster Recovery and Contingency Planning: Murphy’s Law emphasizes the importance of having disaster recovery plans and contingency measures in place. This ensures that the system can be quickly restored to operation in case of a failure, minimizing downtime.
  5. Continuous Integration and Deployment: It advocates for continuous integration and deployment practices where code changes are frequently and automatically tested and deployed. This helps in identifying integration issues early and reduces the risk of failures in production.
  6. User-Centric Design: It reminds developers to consider how real users might interact with the software in unexpected ways, leading to a design that’s more user-friendly and less prone to misuse.

In summary, Murphy’s Law in software engineering is about expecting the unexpected and being prepared for it. It drives a culture of quality, thoroughness, and resilience, helping create software that’s not only functional but also reliable and robust against unforeseen issues.

Q: Talk a little bit more about Emphasis on Thorough Testing

Certainly! Emphasis on thorough testing is a critical aspect of software development, particularly under the guidance of Murphy’s Law, which suggests that anything that can go wrong, will go wrong. In the context of software engineering, this emphasizes the need for comprehensive testing strategies to identify and rectify potential issues before they escalate. Here’s more on the importance and implementation of thorough testing:

Importance of Thorough Testing

  1. Early Bug Detection: The earlier a bug is detected in the software development lifecycle, the cheaper and easier it is to fix. Thorough testing helps catch bugs at the earliest possible stage.
  2. Preventing Future Issues: By testing all aspects of the software, including edge cases, you can prevent potential issues that might not be immediately apparent.
  3. Quality Assurance: Comprehensive testing ensures that the software meets all the specified requirements and functions as expected, which is crucial for user satisfaction and trust.
  4. Risk Mitigation: Testing helps in identifying vulnerabilities and security risks, thereby allowing developers to address these issues before they can be exploited in a production environment.

Implementation of Thorough Testing

  1. Unit Testing: Testing individual components or units of the software to ensure that each part functions correctly in isolation.
  2. Integration Testing: After unit testing, integration testing ensures that different units or components of the software work well together.
  3. System Testing: This tests the complete and integrated software system to verify that it meets all specified requirements.
  4. Stress and Load Testing: Testing how the software behaves under high loads or stress conditions, which is crucial for understanding its performance limits and scalability.
  5. User Acceptance Testing (UAT): Often the final phase, where the software is tested in the ‘real world’ by the intended audience to ensure it meets their needs and expectations.
  6. Automated Testing: Implementing automated tests can significantly increase the scope and frequency of testing, making it more efficient and thorough.
  7. Continuous Testing: As part of Continuous Integration/Continuous Deployment (CI/CD) pipelines, continuous testing involves automated tests being run whenever changes are made to the codebase.
  8. Regression Testing: Whenever new features are added or existing ones are modified, regression testing ensures that these changes haven’t adversely affected the existing functionality.

In conclusion, thorough testing in software engineering is about ensuring reliability, functionality, and security of the software product. It is an ongoing process that requires attention to detail and a deep understanding of both the software being developed and the environment in which it will operate. By prioritizing thorough testing, software teams can significantly reduce the risk of failures and improve the overall quality of their products.
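
As a small concrete illustration of edge-case unit testing, here is a minimal pytest sketch; the divide helper is hypothetical and defined inline just to keep the example self-contained.

    # test_divide.py -- a minimal sketch of edge-case unit testing (pytest assumed).
    import pytest

    def divide(a: float, b: float) -> float:
        # Hypothetical helper: reject the edge case instead of failing silently.
        if b == 0:
            raise ValueError("division by zero")
        return a / b

    def test_divide_normal_case():
        assert divide(10, 4) == 2.5

    def test_divide_edge_case_zero_divisor():
        # Murphy's law in action: someone will eventually pass b = 0.
        with pytest.raises(ValueError):
            divide(10, 0)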

Q: How about User-Centric Design

User-Centric Design, also known as User-Driven or Human-Centered Design, is another important aspect in software engineering, particularly in the context of Murphy’s Law. It focuses on designing software with the end user’s needs, preferences, and behaviors as the primary concern. This approach can greatly reduce the likelihood of problems and enhance the overall user experience. Here’s how User-Centric Design plays a role:

Importance of User-Centric Design

  1. Enhanced Usability: By focusing on the users, the design is more likely to be intuitive and user-friendly, reducing user errors and improving satisfaction.
  2. Increased Adoption and Engagement: Software that meets user needs and expectations is more likely to be adopted and used effectively.
  3. Reduced Development Redo: Understanding user needs from the start can prevent costly redesigns or modifications after deployment.
  4. Feedback Loop: Regular user feedback helps in iteratively refining the product, ensuring it remains aligned with user needs and expectations.

Implementing User-Centric Design

  1. User Research: Understanding who the users are, what they need, what they value, their abilities, and their limitations.
  2. User Personas: Creating detailed profiles of typical users to help guide design decisions.
  3. Usability Testing: Testing the product with real users to observe how they use it and where they encounter problems.
  4. User Interface (UI) Design: Designing interfaces that are intuitive and easy to navigate.
  5. User Experience (UX) Design: Ensuring the overall experience of using the software is positive and meets the users’ needs.
  6. Accessibility: Making sure the software is accessible to users with disabilities, thereby broadening its usability.
  7. Iterative Design Process: Continually testing and refining the product based on user feedback and changing requirements.

Relation to Murphy’s Law

In the context of Murphy’s Law (“Anything that can go wrong, will go wrong”), User-Centric Design is crucial. By understanding and designing for the end user, many potential issues can be anticipated and avoided. For example, a user interface that is confusing or difficult to navigate can lead to errors that might not have been considered if the design was purely function-driven. By placing the user at the center of the development process, the likelihood of unforeseen problems can be significantly reduced.

In summary, User-Centric Design in software engineering helps in creating products that are not only functional but also delightful to use, thereby reducing the risk of problems arising from poor user experience or misalignment with user needs. This approach is essential in mitigating the risks highlighted by Murphy’s Law, ensuring that the software is robust not just in functionality but also in usability.

Other laws to pay attention to [Reference 1]:

  • Brooks’ law: “Adding manpower to a late project makes it later.”
  • Conway’s law: “Any piece of software reflects the organisational communication structure that produced it.”
  • Hofstadter’s law: “It always takes longer than you expect. (Even when you factor in Hofstadter’s law.)”
  • Linus’ law: “Given enough eyeballs, all bugs are shallow.”
  • Goodhart’s law: “When a measure becomes a target, it ceases to be a good measure.”
  • Gall’s law: “A complex system that works has evolved from a simple system that worked. A complex system built from scratch won’t work.”
  • Zawinski’s Law: “Every program attempts to expand until it can read mail. Those programs which cannot expand are replaced by ones that can.”
  • Eagleson’s law: “Any code of your own that you haven’t looked at for six or more months might as well have been written by someone else.”
  • Lubarsky’s law: “There’s always one more bug.”

There is always one more thing: the Pareto Principle (80/20 Rule).

Reference:

  1. https://productcoalition.com/ten-laws-of-software-development-cbd72db0f85c

What is a Merchant of Record (MoR)?

A merchant of record (MoR) is the entity that is authorized, and held liable, by a financial institution to process a consumer’s credit and debit card transactions. The MoR is also the name that appears on the consumer’s credit card statement. The merchant of record is responsible for maintaining a merchant account, processing all payments, and managing all credit card processing fees. The MoR also ensures compliance with the PCI-DSS, stays up to date on any laws where the transactions are taking place, and handles any chargebacks. [1]

Many fintech companies act as payment service providers (PSPs), offering solutions to process both online and offline payments. Some PSPs will act as the MoR for your business, and some will ask you to become an MoR through an application process.

As you can see, it’s quite challenging to become an MoR and take on those financial responsibilities, especially for an SMB. But if your business is mature and expanding, being the MoR has its flexibility and benefits. Let’s look at some popular FinTech payment facilitators (PayFacs):

  • PayFacs that act as MoR: Square, Stripe, PayPal
  • PayFacs that are not MoR: Adyen
References:
  1. https://squareup.com/us/en/townsquare/merchant-of-record
  2. https://www.pymnts.com/news/retail/2017/payfacs-versus-merchants-of-record-who-will-win/

Data Model for HBase

In HBase, data is stored in tables, which have rows and columns. This is a terminology overlap with relational databases (RDBMSs), but this is not a helpful analogy. Instead, it can be helpful to think of an HBase table as a multi-dimensional map.

  • Table: An HBase table consists of multiple rows.
  • Row: A row in HBase consists of a row key and one or more columns with values associated with them. Rows are sorted alphabetically by the row key as they are stored. For this reason, the design of the row key is very important. The goal is to store data in such a way that related rows are near each other. A common row key pattern is a website domain. If your row keys are domains, you should probably store them in reverse (org.apache.www, org.apache.mail, org.apache.jira). This way, all of the Apache domains are near each other in the table, rather than being spread out based on the first letter of the subdomain.
  • Column: A column in HBase consists of a column family and a column qualifier, which are delimited by a : (colon) character.
  • Column Family: Column families physically colocate a set of columns and their values, often for performance reasons. Each column family has a set of storage properties, such as whether its values should be cached in memory, how its data is compressed or its row keys are encoded, and others. Each row in a table has the same column families, though a given row might not store anything in a given column family.
  • Column Qualifier: A column qualifier is added to a column family to provide the index for a given piece of data. Given a column family content, a column qualifier might be content:html, and another might be content:pdf. Though column families are fixed at table creation, column qualifiers are mutable and may differ greatly between rows.
  • Cell: A cell is a combination of row, column family, and column qualifier, and contains a value and a timestamp, which represents the value’s version.
  • Timestamp: A timestamp is written alongside each value, and is the identifier for a given version of a value. By default, the timestamp represents the time on the RegionServer when the data was written, but you can specify a different timestamp value when you put data into the cell.

Conceptually, the data in such a table can be presented in a JSON-like format:
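
The sketch below is a simplified version of the webtable example from the HBase reference guide, nested as row key, then column family, then column qualifier, then timestamp, then cell value (row keys, qualifiers, and timestamps are illustrative):

    {
      "com.cnn.www": {
        "contents": {
          "html": { "t6": "<html>...", "t5": "<html>..." }
        },
        "anchor": {
          "cnnsi.com":  { "t9": "CNN" },
          "my.look.ca": { "t8": "CNN.com" }
        }
      },
      "com.example.www": {
        "contents": {
          "html": { "t5": "<html>..." }
        },
        "people": {
          "author": { "t5": "John Doe" }
        }
      }
    }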

Reference: https://hbase.apache.org/book.html#datamodel

NoSQL Design Principle

NoSQL design requires a different mindset than RDBMS design. For an RDBMS, you can go ahead and create a normalized data model without thinking about access patterns. You can then extend it later when new questions and query requirements arise. You can organize each type of data into its own table.

  • By contrast, you shouldn’t start designing your schema for NoSQL until you know the questions it will need to answer. Understanding the business problems and the application use cases up front is essential.
  • You should maintain as few tables as possible in a NoSQL application. Having fewer tables keeps things more scalable, requires less permissions management, and reduces overhead for your application. It can also help keep backup costs lower overall.

So we normally start an RDBMS design by drawing the entity-relationship diagram. This can support various flexible queries without significant changes to the data model.

In NoSQL, we need to optimize for specific queries to achieve high performance and availability. The first question to answer is: what are the business queries or access patterns?

My experience is to start with an RDBMS first, then switch to NoSQL when it is needed. If you plan to migrate an existing data storage solution, make sure to evaluate different NoSQL options first.
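
For example, a single access pattern such as “fetch the most recent orders for a given customer” maps directly onto a key design. Here is a minimal boto3 sketch, assuming a hypothetical Orders table with CustomerId as the partition key and OrderDate as the sort key:

    import boto3
    from boto3.dynamodb.conditions import Key

    # Hypothetical single-table design driven by one access pattern:
    #   "Fetch the most recent orders for a given customer."
    # Partition key: CustomerId, sort key: OrderDate (all names are illustrative).
    orders = boto3.resource("dynamodb").Table("Orders")

    response = orders.query(
        KeyConditionExpression=Key("CustomerId").eq("C-1001"),
        ScanIndexForward=False,  # newest first, thanks to the sort key order
        Limit=10,
    )
    for item in response["Items"]:
        print(item)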

  • RDBMS
    • SQL
    • Oracle
  • NoSQL
    • Key-Value
      • Dynamo
    • Column Family
      • HBase
      • Cassandra
      • Big Table
    • Document Oriented
      • MongoDB

Reference: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-general-nosql-design.html

From Developer to Architect

What are the key areas an architect should focus on?

  • Performance
  • Scalability
  • Reliability
  • Security
  • Deployment
  • Cost
  • Technology Stack

AWS Well-Architected

  • Operational Excellence Pillar: The operational excellence pillar focuses on running and monitoring systems, and continually improving processes and procedures.
  • Performance Efficiency Pillar: The performance efficiency pillar focuses on structured and streamlined allocation of IT and computing resources.
  • Security Pillar: The security pillar focuses on protecting information and systems.
  • Cost Optimization Pillar: The cost optimization pillar focuses on avoiding unnecessary costs.
  • Reliability Pillar: The reliability pillar focuses on workloads performing their intended functions and how to recover quickly from failure to meet demands.
  • Sustainability Pillar: The sustainability pillar focuses on minimizing the environmental impacts of running cloud workloads.

Credit to: Software Architecture & Technology of Large-Scale Systems

https://aws.amazon.com/architecture/well-architected

Idempotency Key for API Design

Imagine this scenario: you (the caller) are integrating with a payment provider like Stripe (the callee), and you make an API call to process a credit card transaction. For some reason, the response doesn’t come back, or you receive an error message. There are several possibilities: did the callee receive the API call? Did the callee process it? Did the callee return a response? Did the caller receive the response?

One best practice is to implement retry logic for such critical API calls after a timeout. So you send the same API call again; what will happen? That depends! But the key requirement here is to avoid charging the same credit card more than once. We can’t prevent the caller from sending the same API call multiple times, so we need to handle it on the callee side. An idempotency key is one common way to do this.

The example from Stripe is like this:
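
A minimal sketch using the stripe Python library (the key string, amount, and payment method are illustrative; check Stripe’s documentation for the exact parameters of your API version):

    import stripe  # pip install stripe

    stripe.api_key = "sk_test_..."  # placeholder test key

    # The idempotency key identifies this logical charge attempt. A retry reuses
    # the same key, so Stripe returns the saved result instead of charging again.
    payment = stripe.PaymentIntent.create(
        amount=2000,                    # amount in cents
        currency="usd",
        payment_method="pm_card_visa",  # Stripe test payment method
        confirm=True,
        idempotency_key="order-12345-attempt-1",
    )
    print(payment.status)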

By sending the same idempotency key in the request, the callee can detect whether this is a new request or a duplicate. One critical detail here: “Stripe’s idempotency works by saving the resulting status code and body of the first request made for any given idempotency key, regardless of whether it succeeded or failed. Subsequent requests with the same key return the same result, including 500 errors.”
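
Conceptually, the bookkeeping on the callee side might look something like the sketch below (in-memory only; a real service would persist the saved results durably and expire old keys):

    # Sketch of callee-side idempotency handling. _saved_responses maps an
    # idempotency key to the (status_code, body) of the first attempt.
    _saved_responses: dict = {}

    def process_payment(body: dict) -> tuple:
        # Stub standing in for the real charge logic.
        return 200, {"status": "succeeded", "amount": body.get("amount")}

    def handle_request(idempotency_key: str, body: dict) -> tuple:
        if idempotency_key in _saved_responses:
            # Replay the saved status code and body, even if the first attempt failed.
            return _saved_responses[idempotency_key]
        result = process_payment(body)
        _saved_responses[idempotency_key] = result
        return result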

This can get tricky when the callee has dependencies on other internal or external services. For example, you call Stripe to process a credit card payment, and Stripe calls the credit card network. In such a caller-callee API chain, errors can propagate and accumulate.

DynamoDB Time to Live (TTL)

Storing a client-side session is very common in a client-server architecture. Another example is a hotel booking system, where you want to maintain a booking time window of, say, 10 minutes. Implementing this is quite simple in AWS DynamoDB.

Create a table with the following fields: the partition key (UserName) and the sort key (SessionId). Additional attributes like CreationTime and ExpirationTime track the session information. The ExpirationTime attribute will be set as the TTL attribute on the table later.

The TTL attribute must meet these requirements:
– The item must contain the attribute specified when TTL was enabled on the table.
– The TTL attribute’s value must be a top-level Number data type.
– The TTL attribute’s value must be a timestamp in Unix epoch time format, in seconds.
– The TTL attribute’s value must not be more than five years in the past; older timestamps are not processed for deletion.

Enable TTL on the table: select the table in the console, then look under “Additional settings”.
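
Alternatively, here is a minimal boto3 sketch of the same setup (the table name “Sessions” is assumed; attribute names follow the example above, and the 10-minute window comes from the booking scenario):

    import time
    import boto3  # pip install boto3

    dynamodb = boto3.client("dynamodb")

    # Enable TTL on the table, using ExpirationTime as the TTL attribute.
    dynamodb.update_time_to_live(
        TableName="Sessions",
        TimeToLiveSpecification={"Enabled": True, "AttributeName": "ExpirationTime"},
    )

    # Write a session item that DynamoDB will expire roughly 10 minutes from now.
    now = int(time.time())
    dynamodb.put_item(
        TableName="Sessions",
        Item={
            "UserName": {"S": "alice"},
            "SessionId": {"S": "sess-123"},
            "CreationTime": {"N": str(now)},
            "ExpirationTime": {"N": str(now + 10 * 60)},  # Unix epoch seconds
        },
    )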

You can find the delete events under the table’s Monitor tab or in CloudWatch.


Resolve error:14090086:SSL routines:ssl3_get_server_certificate:certificate verify failed

When I run the command “composer diagnose”, I get the following error: error:14090086:SSL routines:ssl3_get_server_certificate:certificate verify failed

To fix it, follow these steps:

  1. Download http://curl.haxx.se/ca/cacert.pem, save it somewhere, and remember the path, for example ~/cacert.pem.
  2. Find php.ini via the command php --ini (it is /usr/local/etc/php/7.1/php.ini for me).
  3. Edit the php.ini file, and uncomment or modify the following two lines (they are in different sections, [curl] and [openssl]):
    • curl.cainfo = ~/cacert.pem

    • openssl.cafile = ~/cacert.pem

  4. Verify the result via the command: php -r "print_r(openssl_get_cert_locations());"

Hadoop Ecosystem

Seven years ago, the Hadoop ecosystem was under rapid development. Now many of its projects are mature enough and ready for production deployment.

(Hadoop ecosystem diagram; credit to Mercy (Ponnupandy) Beckham)

Here is my personal pick to help you get started on your Hadoop journey.

“The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures.”

The project includes these modules:

Hadoop Common: The common utilities that support the other Hadoop modules.
Hadoop Distributed File System (HDFS™): A distributed file system that provides high-throughput access to application data.
Hadoop YARN: A framework for job scheduling and cluster resource management.
Hadoop MapReduce: A YARN-based system for parallel processing of large data sets.

  • YARN (Distributed Resource Management): Part of the core Hadoop project, YARN is the architectural center of Hadoop that allows multiple data processing engines such as interactive SQL, real-time streaming, data science and batch processing to handle data stored in a single platform, unlocking an entirely new approach to analytics. YARN is the foundation of the new generation of Hadoop and is enabling organizations everywhere to realize a modern data architecture.
  • Spark (Distributed Programming): Apache Spark™ is a fast and general engine for large-scale data processing. Spark provides a simple and expressive programming model that supports a wide range of applications, including ETL, machine learning, stream processing, and graph computation. (See the short PySpark sketch after this list.)
  • Tez (Distributed Programming): Apache™ Tez is an extensible framework for building high performance batch and interactive data processing applications, coordinated by YARN in Apache Hadoop. Tez improves the MapReduce paradigm by dramatically improving its speed, while maintaining MapReduce’s ability to scale to petabytes of data. Important Hadoop ecosystem projects like Apache Hive and Apache Pig use Apache Tez, as do a growing number of third party data access applications developed for the broader Hadoop ecosystem.
  • Hive (SQL-On-Hadoop): The Apache Hive ™ data warehouse software facilitates reading, writing, and managing large datasets residing in distributed storage using SQL. Structure can be projected onto data already in storage. A command line tool and JDBC driver are provided to connect users to Hive.
  • Hbase (Column Data Model NoSQL): Apache HBase™ is the Hadoop database, a distributed, scalable, big data store.
  • Cassandra (Column Data Model NoSQL): The Apache Cassandra database is the right choice when you need scalability and high availability without compromising performance. Linear scalability and proven fault-tolerance on commodity hardware or cloud infrastructure make it the perfect platform for mission-critical data. Cassandra’s support for replicating across multiple datacenters is best-in-class, providing lower latency for your users and the peace of mind of knowing that you can survive regional outages.
  • MongoDB (Document Data Model NoSQL): MongoDB is a document database with the scalability and flexibility that you want with the querying and indexing that you need. MongoDB stores data in flexible, JSON-like documents.
  • Redis (Key-Value Data Model NoSQL): Redis is an open source (BSD licensed), in-memory data structure store, used as a database, cache and message broker. It supports data structures such as strings, hashes, lists, sets, sorted sets with range queries, bitmaps, hyperloglogs and geospatial indexes with radius queries. Redis has built-in replication, Lua scripting, LRU eviction, transactions and different levels of on-disk persistence, and provides high availability via Redis Sentinel and automatic partitioning with Redis Cluster.
  • Flume (Data Ingestion): Apache Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data. It has a simple and flexible architecture based on streaming data flows. It is robust and fault tolerant with tunable reliability mechanisms and many failover and recovery mechanisms. It uses a simple extensible data model that allows for online analytic application.
  • Sqoop (Data Ingestion): Apache Sqoop™ is a tool designed for efficiently transferring bulk data between Apache Hadoop and structured datastores such as relational databases.
  • Kafka (Data Ingestion): Kafka™ is used for building real-time data pipelines and streaming apps. It is horizontally scalable, fault-tolerant, wicked fast, and runs in production in thousands of companies.
  • Thrift (Service Programming): The Apache Thrift software framework, for scalable cross-language services development, combines a software stack with a code generation engine to build services that work efficiently and seamlessly between C++, Java, Python, PHP, Ruby, Erlang, Perl, Haskell, C#, Cocoa, JavaScript, Node.js, Smalltalk, OCaml and Delphi and other languages.
  • ZooKeeper (Service Programming): A high-performance coordination service for distributed applications.
  • Mahout (Machine Learning): The Apache Mahout™ project’s goal is to build an environment for quickly creating scalable performant machine learning applications.
  • Oozie (Scheduling): Oozie is a workflow scheduler system to manage Apache Hadoop jobs.
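
To give a flavor of the Spark programming model mentioned above, here is a minimal PySpark word-count sketch (the HDFS input path is a placeholder):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("WordCount").getOrCreate()

    # Read lines from HDFS (placeholder path), split into words, and count them.
    lines = spark.read.text("hdfs:///tmp/input.txt").rdd.map(lambda row: row[0])
    counts = (lines.flatMap(lambda line: line.split())
                   .map(lambda word: (word, 1))
                   .reduceByKey(lambda a, b: a + b))

    for word, count in counts.take(10):
        print(word, count)

    spark.stop()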

Reference:

  1. http://hadoopecosystemtable.github.io
  2. https://mydataexperiments.com/2017/04/11/hadoop-ecosystem-a-quick-glance/